Automate the Boring Stuff in IR with Python¶

2022 NEAIR Workshop Series¶

August 3, 2022¶

Mark Green¶

Holy Family University¶
Executive Director of Institutional Effectiveness¶

Schedule for time together¶

  • Introduction 👋
  • Getting Started with Python 🛫
  • Setting up your "environment" 🎚️
  • Analyzing data 📊
  • Starting to automate 🤖

Goals for today!¶

We're going to review:

  • What is Python and its benefits for data analysis...and more importantly automation 🐍
  • Installing Python on your machine and the common "packages" that we'll use in data analysis 📦
  • Read data from a CSV, a webpage, and a PDF 👁️‍🗨️
  • Review how to clean columns and data tables 🧹
  • Save data into a formatted Excel Workbook 💾
  • Write a script for future replication 📜
  • Discuss some next steps 🔮


What is Python?¶

"Python is a clear and powerful object-oriented programming language..." (wiki.python.org/moin/BeginnersGuide/Overview)


What is Python?¶

Python is an open-source programming language that has packages built for data analysis.
It was designed as a general-purpose programming language first; the data analysis tools came second.
Its standard implementation (CPython) is written in C, and because Python code is interpreted it usually runs a bit slower than compiled languages. It is, however, a lot easier to read and write.


Why do we use it for data analysis?¶

  • Easy-to-use language that makes it easy to "run" 🏃‍♂️
  • Large library of packages for data analysis 📚
  • It's open-source (think, "free") 💰
  • Easy to learn 💡
  • Flexible 🧘‍♂️
  • Fun! 🎉

Why do we use Python?¶

My personal search for a data analysis tool 🚶‍♂️

Code allows for documentation and reproducibility 📄

We can utilize other resources of the computer 💻


Installing Python¶

https://www.python.org/downloads/

Apple Computers
https://www.groovypost.com/howto/install-pip-on-a-mac/

Windows Computers
https://www.activestate.com/resources/quick-reads/how-to-install-pip-on-windows/

Installing Packages¶

Windows Key + r
Type 'cmd' + enter

  • pip install numpy
  • pip install pandas
  • pip install jupyterlab
  • pip install statsmodels
  • pip install matplotlib
  • pip install seaborn
  • pip install -q tabula-py
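
Once the installs finish, it can help to confirm that Python can actually find each package. The sketch below is one way to do that from a notebook cell or a script. The helper name `check_packages` is made up for illustration, and the list uses import names, which is why `tabula-py` appears as `tabula`.

```python
from importlib.util import find_spec

# Import names for the packages installed above. Note that the tabula-py
# distribution is imported as "tabula".
PACKAGES = ["numpy", "pandas", "jupyterlab", "statsmodels",
            "matplotlib", "seaborn", "tabula"]


def check_packages(names):
    """Return {import_name: True/False} without importing the packages."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, installed in check_packages(PACKAGES).items():
        print(f"{name:12s} {'OK' if installed else 'MISSING'}")
```

Any package reported as MISSING can be installed again with `pip install` from the command prompt.
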
Downloading the files¶

https://github.com/mgreen216/automateir

Save your files to a single folder, for example: "C:\Users\[user_name]\Documents\Automate"

Jupyter Notebook and Lab¶

  • Allows you to run code in a "cell"
  • Good for data analysis and testing

Starting Jupyter Lab¶

Windows Key + r
Type 'cmd' + enter
Type 'jupyter-lab' + enter

In [2]:
print('Hello, world!')
Hello, world!

Quick check-in?

In [3]:
# Build your first function!

def square_list(x):
  # Despite the name, this squares a single number, not a list
  return x**2

for i in range(6):
  print(i, square_list(i))
0 0
1 1
2 4
3 9
4 16
5 25
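
The loop above prints each result as it goes. If you want to keep the results for later use, a list comprehension collects them into a list instead. This is a small sketch; the names `square`, `numbers`, and `squares` are illustrative and not part of the workshop files.

```python
def square(x):
    """Return the square of a single number."""
    return x ** 2


numbers = list(range(6))

# Apply the function to every item and keep the results in a new list
squares = [square(n) for n in numbers]

print(squares)  # [0, 1, 4, 9, 16, 25]
```
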

Python Data Structures¶

  • Lists
In [5]:
newList = ['Philadelphia', 'Green', 'Seven']
print(newList)
['Philadelphia', 'Green', 'Seven']
In [6]:
newList.append('PA')
print(newList)
['Philadelphia', 'Green', 'Seven', 'PA']

Dictionaries¶

  • Key-value pairs
  • Each entry has a key and a value
  • Values can be text (strings), numbers, or other objects
In [7]:
myDict = {1: 'One', 2: 'Two', 3: 3}
In [8]:
myDict[2]
Out[8]:
'Two'
In [9]:
myDict.values()
Out[9]:
dict_values(['One', 'Two', 3])
In [10]:
myDict.keys()
Out[10]:
dict_keys([1, 2, 3])
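
Beyond looking up a single key, the most common dictionary tasks are adding an entry, looping over the pairs, and handling a key that might not exist. A minimal sketch using the same dictionary as above (renamed `my_dict` here so it stands alone):

```python
my_dict = {1: 'One', 2: 'Two', 3: 3}

# Add a new key-value pair
my_dict[4] = 'Four'

# Loop over keys and values together
for key, value in my_dict.items():
    print(key, value)

# .get() returns a fallback instead of raising KeyError for a missing key
missing = my_dict.get(5, 'not found')
print(missing)  # not found
```
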

The Packages¶

  • Numpy
  • Pandas
  • Scikit-learn

Numpy¶

  • A package for vector (array) manipulation
  • A vector is like a single column in our Excel spreadsheet
In [11]:
import numpy as np
In [12]:
newArray = np.array(newList)
print(newArray)
['Philadelphia' 'Green' 'Seven' 'PA']
In [13]:
test_list = [1, 2, 3] + [4, 5, 6]
test_array = np.array([1, 2, 3]) + np.array([4, 5, 6])
print(test_list)
print(test_array)
[1, 2, 3, 4, 5, 6]
[5 7 9]
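
The output above shows the key difference: `+` joins two lists end to end, but adds two arrays element by element. The same holds for other operators. A short sketch with made-up numbers:

```python
import numpy as np

counts_list = [1, 2, 3]
counts_array = np.array(counts_list)

print(counts_list * 2)    # [1, 2, 3, 1, 2, 3]  (the list is repeated)
print(counts_array * 2)   # [2 4 6]             (each element is doubled)

# Arrays also support summary methods and filtering by condition
print(counts_array.mean())              # 2.0
print(counts_array[counts_array > 1])   # [2 3]
```
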

Pandas¶

  • A package that allows us to put a series of vectors into a dataframe
  • A dataframe is our matrix, or an individual spreadsheet from Excel

Dataframes¶

  • Used for statistical analysis

Pandas is a widely used Python package. It provides data structures suitable for statistical analysis and adds functions that facilitate data input, organization, and manipulation.

In [14]:
import pandas as pd
import numpy as np
import seaborn as sns
%matplotlib inline

# We'll create a dataframe from 3 different arrays
c = np.arange(1,11,0.1) # This creates an array starting at 1 ending before 11, that increases by 0.1
x = np.reciprocal(c).round(1) # This is the reciprocal of the c array we just created
y = np.log10(c).round(1) # Base-10 log of our array c


df = pd.DataFrame({'Counter': c, 'x':x, 'y':y})
In [15]:
# Let's check the top 5 values of the dataframe
df.head()
Out[15]:
Counter x y
0 1.0 1.0 0.0
1 1.1 0.9 0.0
2 1.2 0.8 0.1
3 1.3 0.8 0.1
4 1.4 0.7 0.1
In [16]:
# There are two ways to view the values of a single column
df['Counter']
Out[16]:
0      1.0
1      1.1
2      1.2
3      1.3
4      1.4
      ... 
95    10.5
96    10.6
97    10.7
98    10.8
99    10.9
Name: Counter, Length: 100, dtype: float64
In [17]:
df.Counter
Out[17]:
0      1.0
1      1.1
2      1.2
3      1.3
4      1.4
      ... 
95    10.5
96    10.6
97    10.7
98    10.8
99    10.9
Name: Counter, Length: 100, dtype: float64
In [18]:
# We can also select several columns by passing the column names in a list

data = df[['Counter', 'y']]
data.head()
Out[18]:
Counter y
0 1.0 0.0
1 1.1 0.0
2 1.2 0.1
3 1.3 0.1
4 1.4 0.1
In [19]:
# We can also slice and analyze sections of a dataframe by referencing the index
df[4:10]
Out[19]:
Counter x y
4 1.4 0.7 0.1
5 1.5 0.7 0.2
6 1.6 0.6 0.2
7 1.7 0.6 0.2
8 1.8 0.6 0.3
9 1.9 0.5 0.3
In [20]:
df[['Counter', 'y']][4:10]
Out[20]:
Counter y
4 1.4 0.1
5 1.5 0.2
6 1.6 0.2
7 1.7 0.2
8 1.8 0.3
9 1.9 0.3
In [21]:
df.iloc[4:10, [0,2]]
Out[21]:
Counter y
4 1.4 0.1
5 1.5 0.2
6 1.6 0.2
7 1.7 0.2
8 1.8 0.3
9 1.9 0.3
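
`iloc` selects by position. Its counterpart `loc` selects by label or by a condition, which is often closer to how we think about a data request ("all rows where y is at least 1"). The sketch below rebuilds the same dataframe so it can run on its own; the names `subset` and `high_y` are illustrative.

```python
import numpy as np
import pandas as pd

c = np.arange(1, 11, 0.1)
df = pd.DataFrame({'Counter': c,
                   'x': np.reciprocal(c).round(1),
                   'y': np.log10(c).round(1)})

# Label-based slice: unlike iloc, the end label (9) is included
subset = df.loc[4:9, ['Counter', 'y']]

# Condition-based selection: keep only rows where y is at least 1.0
high_y = df.loc[df['y'] >= 1.0, ['Counter', 'y']]

print(subset)
print(high_y.head())
```
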
In [22]:
# We can get a 2-D NumPy array (an array of arrays) from the values attribute
df.values
Out[22]:
array([[ 1. ,  1. ,  0. ],
       [ 1.1,  0.9,  0. ],
       [ 1.2,  0.8,  0.1],
       [ 1.3,  0.8,  0.1],
       [ 1.4,  0.7,  0.1],
       [ 1.5,  0.7,  0.2],
       [ 1.6,  0.6,  0.2],
       [ 1.7,  0.6,  0.2],
       [ 1.8,  0.6,  0.3],
       [ 1.9,  0.5,  0.3],
       [ 2. ,  0.5,  0.3],
       [ 2.1,  0.5,  0.3],
       [ 2.2,  0.5,  0.3],
       [ 2.3,  0.4,  0.4],
       [ 2.4,  0.4,  0.4],
       [ 2.5,  0.4,  0.4],
       [ 2.6,  0.4,  0.4],
       [ 2.7,  0.4,  0.4],
       [ 2.8,  0.4,  0.4],
       [ 2.9,  0.3,  0.5],
       [ 3. ,  0.3,  0.5],
       [ 3.1,  0.3,  0.5],
       [ 3.2,  0.3,  0.5],
       [ 3.3,  0.3,  0.5],
       [ 3.4,  0.3,  0.5],
       [ 3.5,  0.3,  0.5],
       [ 3.6,  0.3,  0.6],
       [ 3.7,  0.3,  0.6],
       [ 3.8,  0.3,  0.6],
       [ 3.9,  0.3,  0.6],
       [ 4. ,  0.2,  0.6],
       [ 4.1,  0.2,  0.6],
       [ 4.2,  0.2,  0.6],
       [ 4.3,  0.2,  0.6],
       [ 4.4,  0.2,  0.6],
       [ 4.5,  0.2,  0.7],
       [ 4.6,  0.2,  0.7],
       [ 4.7,  0.2,  0.7],
       [ 4.8,  0.2,  0.7],
       [ 4.9,  0.2,  0.7],
       [ 5. ,  0.2,  0.7],
       [ 5.1,  0.2,  0.7],
       [ 5.2,  0.2,  0.7],
       [ 5.3,  0.2,  0.7],
       [ 5.4,  0.2,  0.7],
       [ 5.5,  0.2,  0.7],
       [ 5.6,  0.2,  0.7],
       [ 5.7,  0.2,  0.8],
       [ 5.8,  0.2,  0.8],
       [ 5.9,  0.2,  0.8],
       [ 6. ,  0.2,  0.8],
       [ 6.1,  0.2,  0.8],
       [ 6.2,  0.2,  0.8],
       [ 6.3,  0.2,  0.8],
       [ 6.4,  0.2,  0.8],
       [ 6.5,  0.2,  0.8],
       [ 6.6,  0.2,  0.8],
       [ 6.7,  0.1,  0.8],
       [ 6.8,  0.1,  0.8],
       [ 6.9,  0.1,  0.8],
       [ 7. ,  0.1,  0.8],
       [ 7.1,  0.1,  0.9],
       [ 7.2,  0.1,  0.9],
       [ 7.3,  0.1,  0.9],
       [ 7.4,  0.1,  0.9],
       [ 7.5,  0.1,  0.9],
       [ 7.6,  0.1,  0.9],
       [ 7.7,  0.1,  0.9],
       [ 7.8,  0.1,  0.9],
       [ 7.9,  0.1,  0.9],
       [ 8. ,  0.1,  0.9],
       [ 8.1,  0.1,  0.9],
       [ 8.2,  0.1,  0.9],
       [ 8.3,  0.1,  0.9],
       [ 8.4,  0.1,  0.9],
       [ 8.5,  0.1,  0.9],
       [ 8.6,  0.1,  0.9],
       [ 8.7,  0.1,  0.9],
       [ 8.8,  0.1,  0.9],
       [ 8.9,  0.1,  0.9],
       [ 9. ,  0.1,  1. ],
       [ 9.1,  0.1,  1. ],
       [ 9.2,  0.1,  1. ],
       [ 9.3,  0.1,  1. ],
       [ 9.4,  0.1,  1. ],
       [ 9.5,  0.1,  1. ],
       [ 9.6,  0.1,  1. ],
       [ 9.7,  0.1,  1. ],
       [ 9.8,  0.1,  1. ],
       [ 9.9,  0.1,  1. ],
       [10. ,  0.1,  1. ],
       [10.1,  0.1,  1. ],
       [10.2,  0.1,  1. ],
       [10.3,  0.1,  1. ],
       [10.4,  0.1,  1. ],
       [10.5,  0.1,  1. ],
       [10.6,  0.1,  1. ],
       [10.7,  0.1,  1. ],
       [10.8,  0.1,  1. ],
       [10.9,  0.1,  1. ]])
In [23]:
# Or a single array by calling it on a single column
df.y.values
Out[23]:
array([0. , 0. , 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3,
       0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
       0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6, 0.7, 0.7, 0.7, 0.7,
       0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.8, 0.8, 0.8, 0.8, 0.8,
       0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.9, 0.9, 0.9, 0.9,
       0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9,
       0.9, 0.9, 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. ,
       1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. , 1. ])

Grouping¶

  • Pandas offers functions to handle data, including missing data, which appear as NaN ("Not a Number")
  • It allows for pivoting for more efficient data manipulation
  • You can group respondents by gender
In [24]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
In [25]:
data = pd.DataFrame({
    'Sex': ['F', 'F', 'M', 'F', 'M',
              'M', 'F','M', 'F', 'M', 'M'],
    'GPA' : [3.41, 3.53, 2.67, 3.19, 4., 2.15, 3.26, 3.91, 3.75, 2.14, 3.67]
})

grouped = data.groupby('Sex')
grouped.describe()
Out[25]:
GPA
count mean std min 25% 50% 75% max
Sex
F 5.0 3.428 0.223204 3.19 3.26 3.41 3.53 3.75
M 6.0 3.090 0.871711 2.14 2.28 3.17 3.85 4.00
In [26]:
grouped.boxplot()
plt.show()
In [27]:
# Show only the female group
df_F = grouped.get_group('F')
df_F
Out[27]:
Sex GPA
0 F 3.41
1 F 3.53
3 F 3.19
6 F 3.26
8 F 3.75
In [28]:
# We can then call descriptive statistics

df_F.std(numeric_only=True)  # numeric_only skips the text column 'Sex'
Out[28]:
GPA    0.223204
dtype: float64
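
Instead of pulling out one group at a time, you can ask for several statistics for every group in a single step with `agg`. A minimal sketch using the same GPA data as above; the name `summary` is illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'Sex': ['F', 'F', 'M', 'F', 'M',
            'M', 'F', 'M', 'F', 'M', 'M'],
    'GPA': [3.41, 3.53, 2.67, 3.19, 4., 2.15, 3.26, 3.91, 3.75, 2.14, 3.67]
})

# One row per group, one column per statistic
summary = data.groupby('Sex')['GPA'].agg(['count', 'mean', 'std']).round(3)
print(summary)
```

The result is itself a dataframe, so it can be sliced, plotted, or saved like any other.
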
In [29]:
import numpy as np
import pandas as pd
import statsmodels.formula.api as sm
import seaborn as sns
In [30]:
# Generate a noisy line and save the data in a pandas dataframe
x = np.arange(100)
y = 2.25*x - 6 + np.random.randn(len(x))

df = pd.DataFrame({'x':x, 'y':y})

sns.regplot(x = 'x', y = 'y', data = df)  # refer to the dataframe columns by name
plt.show()
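
The plot draws a fitted line, but it does not report the numbers behind it. As a rough check, the slope and intercept can be recovered with a least-squares fit. The sketch below uses `np.polyfit`, which is a simpler stand-in and not the statsmodels approach the notebook imports; the fixed random seed is an assumption added so the result is repeatable.

```python
import numpy as np

# Same noisy line as above, with a seeded generator for repeatable results
rng = np.random.default_rng(0)
x = np.arange(100)
y = 2.25 * x - 6 + rng.standard_normal(len(x))

# Fit a straight line (degree-1 polynomial); returns slope first, then intercept
slope, intercept = np.polyfit(x, y, deg=1)

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The estimates should land close to the 2.25 and -6 used to generate the data, with small differences caused by the random noise.
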


IPEDS Data 👩🏽‍🔬¶

In [32]:
# These are the conventional aliases for the packages (pd, np, etc.). You can import them under any name you like.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# We are setting our style for printing graphics. 
sns.set_style('whitegrid')

Load the Data 👨🏽‍💻¶

In [34]:
# We will read a CSV file. The encoding argument is not always needed, but it helps with some downloaded files.
df = pd.read_csv('hd2020.csv', encoding = 'latin')
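
Two other `read_csv` arguments are often useful with institutional files: `usecols` loads only the columns you need, and `dtype` keeps identifier-like fields such as ZIP codes as text so leading zeros are not lost. The sketch below uses a tiny made-up CSV held in memory in place of `hd2020.csv`; the institution names and values are placeholders.

```python
import io
import pandas as pd

# Made-up rows standing in for a real file on disk
sample_csv = io.StringIO(
    "UNITID,INSTNM,ZIP,SECTOR\n"
    "1,Example College A,00961,2\n"
    "2,Example College B,19114,1\n"
)

sample = pd.read_csv(
    sample_csv,
    usecols=['UNITID', 'INSTNM', 'ZIP'],  # skip columns we do not need
    dtype={'ZIP': str},                   # keep ZIP codes as text
)

print(sample['ZIP'].tolist())  # ['00961', '19114']
```

Without the `dtype` setting, a column made up only of digits would be read as integers and `00961` would become `961`.
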

Clean and Explore the Dataset 🧽¶

In [35]:
df.head() # Calls the first 5 rows
Out[35]:
UNITID INSTNM IALIAS ADDR CITY STABBR ZIP FIPS OBEREG CHFNM ... CBSATYPE CSA NECTA COUNTYCD COUNTYNM CNGDSTCD LONGITUD LATITUDE DFRCGID DFRCUSCG
0 100654 Alabama A & M University AAMU 4900 Meridian Street Normal AL 35762 1 5 Dr. Andrew Hugine, Jr. ... 1 290 -2 1089 Madison County 105 -86.568502 34.783368 109 1
1 100663 University of Alabama at Birmingham Administration Bldg Suite 1070 Birmingham AL 35294-0110 1 5 Ray L. Watts ... 1 142 -2 1073 Jefferson County 107 -86.799345 33.505697 95 1
2 100690 Amridge University Southern Christian University Regions University 1200 Taylor Rd Montgomery AL 36117-3553 1 5 Michael C.Turner ... 1 388 -2 1101 Montgomery County 102 -86.174010 32.362609 126 2
3 100706 University of Alabama in Huntsville UAH University of Alabama Huntsville 301 Sparkman Dr Huntsville AL 35899 1 5 Darren Dawson ... 1 290 -2 1089 Madison County 105 -86.640449 34.724557 99 2
4 100724 Alabama State University 915 S Jackson Street Montgomery AL 36104-0271 1 5 Quinton T. Ross ... 1 388 -2 1101 Montgomery County 107 -86.295677 32.364317 118 1

5 rows × 73 columns

In [36]:
df.tail() # Returns the last 5 rows
Out[36]:
UNITID INSTNM IALIAS ADDR CITY STABBR ZIP FIPS OBEREG CHFNM ... CBSATYPE CSA NECTA COUNTYCD COUNTYNM CNGDSTCD LONGITUD LATITUDE DFRCGID DFRCUSCG
6435 496335 Coastline Beauty College - Hemet 2627 West Florida Avenue Suite 100 Hemet CA 92545-3661 6 8 ... 1 348 -2 6065 Riverside County 636 -116.999900 33.746000 -2 -2
6436 496371 Elite Welding Academy South Point 1910 County Road One South Point OH 45680-8849 39 3 Bob Reeves ... 1 170 -2 39087 Lawrence County 3906 -82.594354 38.447233 217 2
6437 496380 Medspa Academies - NIMA National Institute of ... 3993 Howard Hughes Parkway Suite 150 Las Vegas NV 89169-6745 32 8 ... 1 332 -2 32003 Clark County 3201 -115.158153 36.117261 -2 -2
6438 496414 TechSherpas 365 10213 Wilsky Blvd Tampa FL 33625 12 5 Della Wyler ... 1 -2 -2 12057 Hillsborough County 1214 -82.565846 28.042450 -1 -1
6439 496423 Zorganics Institute Beauty and Wellness ZORGANICS INSTITUTE 410 WEST BAKERVIEW ROAD SUITE 112 Bellingham WA 98226 53 8 Frida Emalange ... 1 -2 -2 53073 Whatcom County 5302 -122.494720 48.791194 206 2

5 rows × 73 columns

In [37]:
df.shape
Out[37]:
(6440, 73)
In [38]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6440 entries, 0 to 6439
Data columns (total 73 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   UNITID    6440 non-null   int64  
 1   INSTNM    6440 non-null   object 
 2   IALIAS    6439 non-null   object 
 3   ADDR      6439 non-null   object 
 4   CITY      6440 non-null   object 
 5   STABBR    6440 non-null   object 
 6   ZIP       6440 non-null   object 
 7   FIPS      6440 non-null   int64  
 8   OBEREG    6440 non-null   int64  
 9   CHFNM     6440 non-null   object 
 10  CHFTITLE  6440 non-null   object 
 11  GENTELE   6440 non-null   object 
 12  EIN       6440 non-null   int64  
 13  DUNS      6440 non-null   object 
 14  OPEID     6440 non-null   int64  
 15  OPEFLAG   6440 non-null   int64  
 16  WEBADDR   6440 non-null   object 
 17  ADMINURL  6440 non-null   object 
 18  FAIDURL   6440 non-null   object 
 19  APPLURL   6440 non-null   object 
 20  NPRICURL  6440 non-null   object 
 21  VETURL    6440 non-null   object 
 22  ATHURL    6440 non-null   object 
 23  DISAURL   6440 non-null   object 
 24  SECTOR    6440 non-null   int64  
 25  ICLEVEL   6440 non-null   int64  
 26  CONTROL   6440 non-null   int64  
 27  HLOFFER   6440 non-null   int64  
 28  UGOFFER   6440 non-null   int64  
 29  GROFFER   6440 non-null   int64  
 30  HDEGOFR1  6440 non-null   int64  
 31  DEGGRANT  6440 non-null   int64  
 32  HBCU      6440 non-null   int64  
 33  HOSPITAL  6440 non-null   int64  
 34  MEDICAL   6440 non-null   int64  
 35  TRIBAL    6440 non-null   int64  
 36  LOCALE    6440 non-null   int64  
 37  OPENPUBL  6440 non-null   int64  
 38  ACT       6440 non-null   object 
 39  NEWID     6440 non-null   int64  
 40  DEATHYR   6440 non-null   int64  
 41  CLOSEDAT  6440 non-null   object 
 42  CYACTIVE  6440 non-null   int64  
 43  POSTSEC   6440 non-null   int64  
 44  PSEFLAG   6440 non-null   int64  
 45  PSET4FLG  6440 non-null   int64  
 46  RPTMTH    6440 non-null   int64  
 47  INSTCAT   6440 non-null   int64  
 48  C18BASIC  6440 non-null   int64  
 49  C18IPUG   6440 non-null   int64  
 50  C18IPGRD  6440 non-null   int64  
 51  C18UGPRF  6440 non-null   int64  
 52  C18ENPRF  6440 non-null   int64  
 53  C18SZSET  6440 non-null   int64  
 54  C15BASIC  6440 non-null   int64  
 55  CCBASIC   6440 non-null   int64  
 56  CARNEGIE  6440 non-null   int64  
 57  LANDGRNT  6440 non-null   int64  
 58  INSTSIZE  6440 non-null   int64  
 59  F1SYSTYP  6440 non-null   int64  
 60  F1SYSNAM  6440 non-null   object 
 61  F1SYSCOD  6440 non-null   int64  
 62  CBSA      6440 non-null   int64  
 63  CBSATYPE  6440 non-null   int64  
 64  CSA       6440 non-null   int64  
 65  NECTA     6440 non-null   int64  
 66  COUNTYCD  6440 non-null   int64  
 67  COUNTYNM  6440 non-null   object 
 68  CNGDSTCD  6440 non-null   int64  
 69  LONGITUD  6440 non-null   float64
 70  LATITUDE  6440 non-null   float64
 71  DFRCGID   6440 non-null   int64  
 72  DFRCUSCG  6440 non-null   int64  
dtypes: float64(2), int64(49), object(22)
memory usage: 3.6+ MB
In [39]:
df.columns
Out[39]:
Index(['UNITID', 'INSTNM', 'IALIAS', 'ADDR', 'CITY', 'STABBR', 'ZIP', 'FIPS',
       'OBEREG', 'CHFNM', 'CHFTITLE', 'GENTELE', 'EIN', 'DUNS', 'OPEID',
       'OPEFLAG', 'WEBADDR', 'ADMINURL', 'FAIDURL', 'APPLURL', 'NPRICURL',
       'VETURL', 'ATHURL', 'DISAURL', 'SECTOR', 'ICLEVEL', 'CONTROL',
       'HLOFFER', 'UGOFFER', 'GROFFER', 'HDEGOFR1', 'DEGGRANT', 'HBCU',
       'HOSPITAL', 'MEDICAL', 'TRIBAL', 'LOCALE', 'OPENPUBL', 'ACT', 'NEWID',
       'DEATHYR', 'CLOSEDAT', 'CYACTIVE', 'POSTSEC', 'PSEFLAG', 'PSET4FLG',
       'RPTMTH', 'INSTCAT', 'C18BASIC', 'C18IPUG', 'C18IPGRD', 'C18UGPRF',
       'C18ENPRF', 'C18SZSET', 'C15BASIC', 'CCBASIC', 'CARNEGIE', 'LANDGRNT',
       'INSTSIZE', 'F1SYSTYP', 'F1SYSNAM', 'F1SYSCOD', 'CBSA', 'CBSATYPE',
       'CSA', 'NECTA', 'COUNTYCD', 'COUNTYNM', 'CNGDSTCD', 'LONGITUD',
       'LATITUDE', 'DFRCGID', 'DFRCUSCG'],
      dtype='object')
In [40]:
df.describe(include='all')
Out[40]:
UNITID INSTNM IALIAS ADDR CITY STABBR ZIP FIPS OBEREG CHFNM ... CBSATYPE CSA NECTA COUNTYCD COUNTYNM CNGDSTCD LONGITUD LATITUDE DFRCGID DFRCUSCG
count 6440.000000 6440 6439 6439 6440 6440 6440 6440.000000 6440.000000 6440 ... 6440.000000 6440.000000 6440.000000 6440.000000 6440 6440.000000 6440.000000 6440.000000 6440.000000 6440.000000
unique NaN 6319 2074 6358 2368 59 5738 NaN NaN 5620 ... NaN NaN NaN NaN 1051 NaN NaN NaN NaN NaN
top NaN Stevens-Henager College New York CA 00961 NaN NaN ... NaN NaN NaN NaN Los Angeles County NaN NaN NaN NaN NaN
freq NaN 7 4253 12 84 700 9 NaN NaN 87 ... NaN NaN NaN NaN 203 NaN NaN NaN NaN NaN
mean 286466.032764 NaN NaN NaN NaN NaN NaN 29.175932 4.656832 NaN ... 0.970963 264.446118 3627.057453 29229.583385 NaN 2926.969720 -90.507892 37.241228 101.106211 1.455590
std 139024.241067 NaN NaN NaN NaN NaN NaN 16.978805 2.211014 NaN ... 0.711930 181.658369 16015.889521 16992.453726 NaN 1700.057913 18.141579 5.941976 66.491820 0.877432
min 100654.000000 NaN NaN NaN NaN NaN NaN 1.000000 0.000000 NaN ... -2.000000 -2.000000 -2.000000 -2.000000 NaN -2.000000 -170.742774 -14.322636 -2.000000 -2.000000
25% 170050.500000 NaN NaN NaN NaN NaN NaN 13.000000 3.000000 NaN ... 1.000000 122.000000 -2.000000 13089.000000 NaN 1305.000000 -97.674073 33.911633 43.000000 1.000000
50% 221258.500000 NaN NaN NaN NaN NaN NaN 29.000000 5.000000 NaN ... 1.000000 297.000000 -2.000000 29189.000000 NaN 2907.000000 -86.392907 38.637173 99.000000 2.000000
75% 446568.000000 NaN NaN NaN NaN NaN NaN 42.000000 6.000000 NaN ... 1.000000 408.000000 -2.000000 42045.000000 NaN 4207.000000 -78.794307 41.250285 153.000000 2.000000
max 496423.000000 NaN NaN NaN NaN NaN NaN 78.000000 9.000000 NaN ... 2.000000 566.000000 79600.000000 78030.000000 NaN 7898.000000 171.378129 71.324702 231.000000 2.000000

11 rows × 73 columns

In [41]:
hd = df[['UNITID', 'INSTNM', 'IALIAS', 'CITY', 'STABBR', 'ZIP', 'CHFNM', 'CHFTITLE', 'OPEID', 'SECTOR', 'HBCU', 'HLOFFER', 'UGOFFER']]
In [42]:
hd.head()
Out[42]:
UNITID INSTNM IALIAS CITY STABBR ZIP CHFNM CHFTITLE OPEID SECTOR HBCU HLOFFER UGOFFER
0 100654 Alabama A & M University AAMU Normal AL 35762 Dr. Andrew Hugine, Jr. President 100200 1 1 9 1
1 100663 University of Alabama at Birmingham Birmingham AL 35294-0110 Ray L. Watts President 105200 1 2 9 1
2 100690 Amridge University Southern Christian University Regions University Montgomery AL 36117-3553 Michael C.Turner President 2503400 2 2 9 1
3 100706 University of Alabama in Huntsville UAH University of Alabama Huntsville Huntsville AL 35899 Darren Dawson President 105500 1 2 9 1
4 100724 Alabama State University Montgomery AL 36104-0271 Quinton T. Ross President 100500 1 1 9 1
In [43]:
for c in hd.iloc[:,1:].columns:
  print(str(c) + "\n" + str(hd[c].value_counts()))
INSTNM
Stevens-Henager College               7
Columbia College                      5
Brittany Beauty Academy               4
Arthur's Beauty College               4
Eastern Suffolk BOCES                 3
                                     ..
Cankdeska Cikana Community College    1
Northeastern State University         1
South Texas College                   1
Eastern University                    1
Virginia University of Lynchburg      1
Name: INSTNM, Length: 6319, dtype: int64
IALIAS
                                                                                                                                                                                                                                                                                                         4253
Ogle School                                                                                                                                                                                                                                                                                                 8
Northwest College - School of Beauty                                                                                                                                                                                                                                                                        6
SCC                                                                                                                                                                                                                                                                                                         4
Career Quest                                                                                                                                                                                                                                                                                                4
                                                                                                                                                                                                                                                                                                         ... 
Robert Morris College  RMC  RMU                                                                                                                                                                                                                                                                             1
BGU, Bethany College of Missions, BCOM, Bethany                                                                                                                                                                                                                                                             1
BBC  BBC&S  Summit U  Summit University  BBC & S  BBS  Baptist Bible College of PA  Baptist Bible College  Baptist Bible Seminary  Baptist Bible College & Seminary  Baptist Bible College of Pennsylvania  Summit University of Pennsylvania  SU of Pennsylvania  Summit U  Summit U of Pennsylvania       1
IMTI                                                                                                                                                                                                                                                                                                        1
The Mount|MWCC                                                                                                                                                                                                                                                                                              1
Name: IALIAS, Length: 2074, dtype: int64
CITY
New York           84
Chicago            71
Houston            64
Los Angeles        51
Brooklyn           49
                   ..
Macomb              1
Greencastle         1
San Anselmo         1
South Fallsburg     1
Leesport            1
Name: CITY, Length: 2368, dtype: int64
STABBR
CA    700
NY    439
TX    416
FL    356
PA    338
OH    283
IL    256
MI    168
NC    167
NJ    165
PR    155
TN    154
MA    152
MO    150
GA    148
VA    146
AZ    116
LA    115
IN    108
OK    104
WA    103
MN    100
CO     98
SC     96
WI     92
KY     88
AR     86
MD     82
AL     81
IA     80
OR     80
KS     75
WV     73
CT     71
UT     69
MS     56
NM     49
NE     43
ID     40
NV     39
NH     38
ME     37
MT     31
SD     29
ND     27
DC     25
HI     24
RI     22
VT     22
DE     19
AK     10
WY     10
GU      3
FM      1
VI      1
MH      1
MP      1
AS      1
PW      1
Name: STABBR, dtype: int64
ZIP
00961         9
90010         5
60605         5
78229         5
33144         5
             ..
64850         1
45551         1
93305-1299    1
11218-5611    1
68005-3098    1
Name: ZIP, Length: 5738, dtype: int64
CHFNM
                          87
Franklin K. Schoeneman    82
George Grayeb             40
Fardad Fateri             25
Mitch Charles             18
                          ..
Kathleen Rose              1
Rick Brewer                1
Elizabeth Fogle            1
Dr. Ellen Gambino          1
Timothy Hood               1
Name: CHFNM, Length: 5620, dtype: int64
CHFTITLE
President                           3240
Chancellor                           258
Director                             230
CEO                                  189
Owner                                172
                                    ... 
Chief Adminstrator Officer             1
Owner/CFO                              1
Director of CTE                        1
Academy Director                       1
Operation & Financial Aid Leader       1
Name: CHFTITLE, Length: 595, dtype: int64
OPEID
-2          31
 202500      3
 332900      3
 245300      2
 1303900     2
            ..
 4176000     1
 2526100     1
 175000      1
 293600      1
 4016500     1
Name: OPEID, Length: 6369, dtype: int64
SECTOR
2     1673
9     1504
4      930
1      806
6      588
3      367
7      235
5      147
0       72
8       66
99      52
Name: SECTOR, dtype: int64
HBCU
2    6338
1     102
Name: HBCU, dtype: int64
HLOFFER
 2    1629
 9    1256
 3    1110
 5     740
 7     737
 4     575
 1     176
 8     151
-3      52
 6      14
Name: HLOFFER, dtype: int64
UGOFFER
 1    6085
 2     303
-3      52
Name: UGOFFER, dtype: int64
In [44]:
figsize = (12, 6)

plt.figure(figsize = figsize)
sns.countplot(x = 'SECTOR', data = hd)
plt.xlabel("Sector of Institution")
plt.ylabel("Number of Institutions")
plt.title("Count of Institutions by Sector")
plt.show()
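
Coded columns such as SECTOR and HBCU are easier to read in tables and charts once the codes are replaced with labels. A dictionary plus `map` handles this. The sketch below uses the five UNITID and HBCU values shown in `hd.head()` above. The label mapping (1 = Yes, 2 = No) is an assumption that should be checked against the IPEDS data dictionary for the file, and the names `hd_sample` and `hbcu_labels` are illustrative.

```python
import pandas as pd

# The first five rows of the UNITID and HBCU columns shown earlier
hd_sample = pd.DataFrame({
    'UNITID': [100654, 100663, 100690, 100706, 100724],
    'HBCU': [1, 2, 2, 2, 1],
})

# Assumed code labels; confirm against the IPEDS data dictionary
hbcu_labels = {1: 'Yes', 2: 'No'}

# map() looks up each code in the dictionary; unknown codes become NaN
hd_sample['HBCU_LABEL'] = hd_sample['HBCU'].map(hbcu_labels)

counts = hd_sample['HBCU_LABEL'].value_counts()
print(counts)
```

The same pattern works for SECTOR or any other coded column: build the dictionary once and reuse it across reports.
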

Reading Tables from a Webpage 😎¶

https://nces.ed.gov/programs/digest/d21/tables/dt21_303.10.asp?current=yes

In [45]:
df = pd.read_html('https://nces.ed.gov/programs/digest/d21/tables/dt21_303.10.asp?current=yes')
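
`pd.read_html` returns a list of dataframes, one for each table it finds on the page, so the usual next step is to pick out the table you want and tidy its column names. The sketch below shows one way to do that. It builds two small stand-in tables by hand so it can run without a network connection; the stand-ins and the names `tables` and `biggest` are illustrative, and choosing the largest table is a rule of thumb, not a guarantee.

```python
import pandas as pd

# Stand-ins for the list that pd.read_html returns
nav = pd.DataFrame([['Previous Page', 'Download Excel']])
enrollment = pd.DataFrame(
    [[1947, 2338226], [1948, 2403396]],
    columns=pd.MultiIndex.from_tuples([
        ('Year', 'Year'),
        ('Total enrollment', 'Total enrollment'),
    ]),
)
tables = [nav, enrollment]

# Rule of thumb: the data table is usually the one with the most cells
biggest = max(tables, key=lambda t: t.shape[0] * t.shape[1]).copy()

# Multi-row headers arrive as a MultiIndex; join the levels into one name,
# dropping repeated level names
if isinstance(biggest.columns, pd.MultiIndex):
    biggest.columns = [
        ' '.join(dict.fromkeys(str(level) for level in col))
        for col in biggest.columns
    ]

print(biggest)
```

With a real page, it is worth printing the shape of every table in the list first to confirm which one holds the data.
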
In [46]:
print(type(df))
<class 'list'>
In [47]:
print(df)
[                         0                                1  \
0  2021 Tables and Figures  All Years of Tables and Figures   

                                      2  
0  Most Recent Full Issue of the Digest  ,                0                      1
0  Previous Page  Download Excel (50KB),                0                                                  1
0  Table 303.10.  Total fall enrollment in degree-granting posts...,      Year Totalenrollment Attendance status                                 \
     Year Totalenrollment         Full-time Part-time     Percentpart-time   
     Year Totalenrollment         Full-time Part-time     Percentpart-time   
        1               2                 3         4 4.1                5   
0   19471         2338226               ---       --- NaN              ---   
1   19481         2403396               ---       --- NaN              ---   
2   19491         2444900               ---       --- NaN              ---   
3   19501         2281298               ---       --- NaN              ---   
4   19511         2101962               ---       --- NaN              ---   
..    ...             ...               ...       ...  ..              ...   
76  20265        20054000          12182000   7872000 NaN             39.3   
77  20275        20169000          12241000   7928000 NaN             39.3   
78  20285        20282000          12301000   7980000 NaN             39.3   
79  20295        20393000          12361000   8032000 NaN             39.4   
80  20305        20482000          12402000   8080000 NaN             39.4   

   Sex of student                         Control of institution           \
             Male    Female Percentfemale                 Public  Private   
             Male    Female Percentfemale                 Public    Total   
                6         7             8                      9       10   
0         1659249    678977          29.0                1152377  1185849   
1         1709367    694029          28.9                1185588  1217808   
2         1721572    723328          29.6                1207151  1237749   
3         1560392    720906          31.6                1139699  1141599   
4         1390740    711222          33.8                1037938  1064024   
..            ...       ...           ...                    ...      ...   
76        8497000  11557000          57.6               14801000  5253000   
77        8550000  11619000          57.6               14882000  5287000   
78        8603000  11679000          57.6               14960000  5322000   
79        8656000  11737000          57.6               15038000  5355000   
80        8700000  11783000          57.5               15101000  5381000   

                              
                              
   Nonprofit For-profit       
          11         12 12.1  
0        ---        ---  NaN  
1        ---        ---  NaN  
2        ---        ---  NaN  
3        ---        ---  NaN  
4        ---        ---  NaN  
..       ...        ...  ...  
76       ---        ---  NaN  
77       ---        ---  NaN  
78       ---        ---  NaN  
79       ---        ---  NaN  
80       ---        ---  NaN  

[81 rows x 14 columns],                                                   0   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  1   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  2   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  3   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  4   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  5   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  6   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  7   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  8   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  9   \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  10  \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  11  \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  12  \
0                                  ---Not available.   
1                   1 Degree-credit enrollment only.   
2  2 Includes part-time resident students and all...   
3  3 Large increases are due to the addition of s...   
4  4 Because of imputation techniques, data are n...   
5                                       5 Projected.   
6  NOTE: Data in this table represent the 50 stat...   
7  SOURCE: U.S. Department of Education, National...   

                                                  13  
0                                  ---Not available.  
1                   1 Degree-credit enrollment only.  
2  2 Includes part-time resident students and all...  
3  3 Large increases are due to the addition of s...  
4  4 Because of imputation techniques, data are n...  
5                                       5 Projected.  
6  NOTE: Data in this table represent the 50 stat...  
7  SOURCE: U.S. Department of Education, National...  ,                          0                                1  \
0  2021 Tables and Figures  All Years of Tables and Figures   

                                      2  
0  Most Recent Full Issue of the Digest  ,                0                      1
0  Previous Page  Download Excel (50KB)]
In [48]:
df[2]
Out[48]:
0 1
0 Table 303.10. Total fall enrollment in degree-granting posts...
In [49]:
df[3]
Out[49]:
Year Totalenrollment Attendance status Sex of student Control of institution
Year Totalenrollment Full-time Part-time Percentpart-time Male Female Percentfemale Public Private
Year Totalenrollment Full-time Part-time Percentpart-time Male Female Percentfemale Public Total Nonprofit For-profit
1 2 3 4 4.1 5 6 7 8 9 10 11 12 12.1
0 19471 2338226 --- --- NaN --- 1659249 678977 29.0 1152377 1185849 --- --- NaN
1 19481 2403396 --- --- NaN --- 1709367 694029 28.9 1185588 1217808 --- --- NaN
2 19491 2444900 --- --- NaN --- 1721572 723328 29.6 1207151 1237749 --- --- NaN
3 19501 2281298 --- --- NaN --- 1560392 720906 31.6 1139699 1141599 --- --- NaN
4 19511 2101962 --- --- NaN --- 1390740 711222 33.8 1037938 1064024 --- --- NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
76 20265 20054000 12182000 7872000 NaN 39.3 8497000 11557000 57.6 14801000 5253000 --- --- NaN
77 20275 20169000 12241000 7928000 NaN 39.3 8550000 11619000 57.6 14882000 5287000 --- --- NaN
78 20285 20282000 12301000 7980000 NaN 39.3 8603000 11679000 57.6 14960000 5322000 --- --- NaN
79 20295 20393000 12361000 8032000 NaN 39.4 8656000 11737000 57.6 15038000 5355000 --- --- NaN
80 20305 20482000 12402000 8080000 NaN 39.4 8700000 11783000 57.5 15101000 5381000 --- --- NaN

81 rows × 14 columns

In [50]:
enroll_df = df[3]
In [51]:
enroll_df.head()
# Found the data table -- what is up with the columns? 
Out[51]:
Year Totalenrollment Attendance status Sex of student Control of institution
Year Totalenrollment Full-time Part-time Percentpart-time Male Female Percentfemale Public Private
Year Totalenrollment Full-time Part-time Percentpart-time Male Female Percentfemale Public Total Nonprofit For-profit
1 2 3 4 4.1 5 6 7 8 9 10 11 12 12.1
0 19471 2338226 --- --- NaN --- 1659249 678977 29.0 1152377 1185849 --- --- NaN
1 19481 2403396 --- --- NaN --- 1709367 694029 28.9 1185588 1217808 --- --- NaN
2 19491 2444900 --- --- NaN --- 1721572 723328 29.6 1207151 1237749 --- --- NaN
3 19501 2281298 --- --- NaN --- 1560392 720906 31.6 1139699 1141599 --- --- NaN
4 19511 2101962 --- --- NaN --- 1390740 711222 33.8 1037938 1064024 --- --- NaN
In [52]:
enroll_df.columns
# It's a MultiIndex, which means each column label is a tuple with one entry per header row
Out[52]:
MultiIndex([(                  'Year',             'Year', ...),
            (       'Totalenrollment',  'Totalenrollment', ...),
            (     'Attendance status',        'Full-time', ...),
            (     'Attendance status',        'Part-time', ...),
            (     'Attendance status',        'Part-time', ...),
            (     'Attendance status', 'Percentpart-time', ...),
            (        'Sex of student',             'Male', ...),
            (        'Sex of student',           'Female', ...),
            (        'Sex of student',    'Percentfemale', ...),
            ('Control of institution',           'Public', ...),
            ('Control of institution',          'Private', ...),
            ('Control of institution',          'Private', ...),
            ('Control of institution',          'Private', ...),
            ('Control of institution',          'Private', ...)],
           )
In [53]:
enroll_df.columns[12][2]
# So we want the third value in each tuple, i.e. index 2!
Out[53]:
'For-profit'
In [54]:
# Let's spend some time here

enroll_df.columns = enroll_df.columns.map(lambda x: x[2])
enroll_df.columns
Out[54]:
Index(['Year', 'Totalenrollment', 'Full-time', 'Part-time', 'Part-time',
       'Percentpart-time', 'Male', 'Female', 'Percentfemale', 'Public',
       'Total', 'Nonprofit', 'For-profit', 'For-profit'],
      dtype='object')
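The flattening step above can also be sketched on a small toy frame. This is only an illustration: the three-level header and the values below are made up, and `joined` is a hypothetical alternative that keeps the parent labels so that repeated names stay distinguishable.

```python
import pandas as pd

# Toy frame with three header levels, loosely mimicking the enrollment table (made-up values)
cols = pd.MultiIndex.from_tuples([
    ("Year", "Year", "Year"),
    ("Control of institution", "Public", "Public"),
    ("Control of institution", "Private", "Total"),
    ("Control of institution", "Private", "Nonprofit"),
])
toy = pd.DataFrame([[2019, 10, 5, 4], [2020, 11, 6, 5]], columns=cols)

# Option 1: keep only the third level, equivalent to columns.map(lambda x: x[2])
flat = toy.copy()
flat.columns = flat.columns.get_level_values(2)

# Option 2: join the distinct labels of each tuple (dict.fromkeys drops repeats, keeps order)
joined = toy.copy()
joined.columns = ["_".join(dict.fromkeys(t)) for t in toy.columns]

print(list(flat.columns))
print(list(joined.columns))
```

Option 1 gives the short names used in this notebook; Option 2 may be worth considering when two columns would otherwise end up with the same name.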
In [55]:
enroll_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 81 entries, 0 to 80
Data columns (total 14 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Year              81 non-null     int64  
 1   Totalenrollment   81 non-null     int64  
 2   Full-time         81 non-null     object 
 3   Part-time         81 non-null     object 
 4   Part-time         7 non-null      float64
 5   Percentpart-time  81 non-null     object 
 6   Male              81 non-null     int64  
 7   Female            81 non-null     int64  
 8   Percentfemale     81 non-null     float64
 9   Public            81 non-null     int64  
 10  Total             81 non-null     int64  
 11  Nonprofit         81 non-null     object 
 12  For-profit        81 non-null     object 
 13  For-profit        5 non-null      float64
dtypes: float64(3), int64(6), object(5)
memory usage: 9.0+ KB
In [56]:
enroll_df.head()
# How are we going to clean up the year? 
Out[56]:
Year Totalenrollment Full-time Part-time Part-time Percentpart-time Male Female Percentfemale Public Total Nonprofit For-profit For-profit
0 19471 2338226 --- --- NaN --- 1659249 678977 29.0 1152377 1185849 --- --- NaN
1 19481 2403396 --- --- NaN --- 1709367 694029 28.9 1185588 1217808 --- --- NaN
2 19491 2444900 --- --- NaN --- 1721572 723328 29.6 1207151 1237749 --- --- NaN
3 19501 2281298 --- --- NaN --- 1560392 720906 31.6 1139699 1141599 --- --- NaN
4 19511 2101962 --- --- NaN --- 1390740 711222 33.8 1037938 1064024 --- --- NaN
In [57]:
enroll_df['year_clean'] = enroll_df['Year'].astype('str').str[:4]
enroll_df.head()
Out[57]:
Year Totalenrollment Full-time Part-time Part-time Percentpart-time Male Female Percentfemale Public Total Nonprofit For-profit For-profit year_clean
0 19471 2338226 --- --- NaN --- 1659249 678977 29.0 1152377 1185849 --- --- NaN 1947
1 19481 2403396 --- --- NaN --- 1709367 694029 28.9 1185588 1217808 --- --- NaN 1948
2 19491 2444900 --- --- NaN --- 1721572 723328 29.6 1207151 1237749 --- --- NaN 1949
3 19501 2281298 --- --- NaN --- 1560392 720906 31.6 1139699 1141599 --- --- NaN 1950
4 19511 2101962 --- --- NaN --- 1390740 711222 33.8 1037938 1064024 --- --- NaN 1951
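The year cleanup works because the footnote marker is appended after the four-digit year (19471 is 1947 with footnote 1). A minimal sketch on made-up rows, which also shows one possible way to turn the `---` ("Not available") placeholders into missing values; the column names `year_clean` and `full_time_clean` here are just for illustration:

```python
import pandas as pd

# Made-up rows in the same shape as the scraped table
toy = pd.DataFrame({
    "Year": [19471, 2017, 20215],
    "Full-time": ["---", "12076141", "12387000"],
})

# Keep the first four characters of the year and store them as integers
toy["year_clean"] = toy["Year"].astype(str).str[:4].astype(int)

# Anything that is not a number (such as '---') becomes NaN
toy["full_time_clean"] = pd.to_numeric(toy["Full-time"], errors="coerce")

print(toy)
```

Converting the placeholder columns to numbers is what lets them be summed or plotted later; as `object` columns they behave like text.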
In [59]:
years = np.arange(2017, 2028, 1) # Filter for recent years; arange excludes the stop value, so this covers 2017-2027
recent_years = enroll_df[enroll_df['year_clean'].astype(int).isin(years)]
recent_years
Out[59]:
Year Totalenrollment Full-time Part-time Part-time Percentpart-time Male Female Percentfemale Public Total Nonprofit For-profit For-profit year_clean
67 2017 19778151 12076141 7702010 NaN 38.9 8571314 11206837 56.7 14571739 5206412 4108489 1097923 NaN 2017
68 2018 19651412 11989569 7661843 NaN 39.0 8444614 11206798 57.0 14539257 5112155 4131846 980309 NaN 2018
69 2019 19630178 11954413 7675765 NaN 39.1 8363889 11266289 57.4 14503647 5126531 4135372 991159 NaN 2019
70 2020 18991798 11591353 7400445 NaN 39.0 7869545 11122253 58.6 13867239 5124559 4101019 1023540 NaN 2020
71 20215 20327000 12387000 7941000 NaN 39.1 8685000 11643000 57.3 14975000 5352000 --- --- NaN 2021
72 20225 20031000 12177000 7854000 NaN 39.2 8524000 11506000 57.4 14769000 5261000 --- --- NaN 2022
73 20235 19851000 12041000 7810000 NaN 39.3 8422000 11429000 57.6 14650000 5201000 --- --- NaN 2023
74 20245 19862000 12041000 7821000 NaN 39.4 8416000 11446000 57.6 14664000 5198000 --- --- NaN 2024
75 20255 19934000 12099000 7835000 NaN 39.3 8444000 11490000 57.6 14716000 5218000 --- --- NaN 2025
76 20265 20054000 12182000 7872000 NaN 39.3 8497000 11557000 57.6 14801000 5253000 --- --- NaN 2026
77 20275 20169000 12241000 7928000 NaN 39.3 8550000 11619000 57.6 14882000 5287000 --- --- NaN 2027
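The filter above relies on `np.arange` leaving out its stop value, which is easy to get wrong by one year. A small sketch on made-up years, comparing it with `Series.between` (which includes both ends by default) as a possible alternative:

```python
import numpy as np
import pandas as pd

# Made-up year strings, in the same form as the year_clean column
toy = pd.DataFrame({"year_clean": ["2015", "2017", "2020", "2027", "2028"]})
year_int = toy["year_clean"].astype(int)

# Same approach as the notebook: 2028 is the stop value and is excluded
years = np.arange(2017, 2028)
via_isin = toy[year_int.isin(years)]

# Alternative: both endpoints are written out and both are included
via_between = toy[year_int.between(2017, 2027)]

print(via_isin)
print(via_isin.equals(via_between))
```

Both versions select the same rows here; `between` just makes the last year visible in the code.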
In [61]:
import matplotlib.pyplot as plt
import seaborn as sns
In [62]:
plt.figure(figsize=(15,6))
sns.set_theme(style="whitegrid", context="talk")
sns.set_color_codes('colorblind')
sns.lineplot(x = 'year_clean', y= 'Totalenrollment', data = recent_years)
plt.xlabel("Year")
plt.ylabel("Total Enrollment")
plt.title("Enrollment and Projected Enrollment for Higher Education")
plt.show()
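Since the chart mixes actual and projected enrollment, it may help to flag the projected rows before plotting. In the scraped table, footnote 5 means "Projected" and shows up as a trailing 5 on the year (for example 20215). A minimal sketch on made-up rows; the `footnote` and `projected` column names are assumptions for illustration:

```python
import pandas as pd

# Made-up rows: two actual years and two projected years (footnote 5)
toy = pd.DataFrame({"Year": [2019, 2020, 20215, 20225]})

year_str = toy["Year"].astype(str)
toy["year_clean"] = year_str.str[:4].astype(int)

# Everything after the fourth character is the footnote marker ('' when there is none)
toy["footnote"] = year_str.str[4:]
toy["projected"] = toy["footnote"].eq("5")

print(toy)
```

A flag like this could then be passed to a plotting argument such as `style` or `hue` in `sns.lineplot` to draw the projected years differently.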

Reading a PDF¶


In [65]:
from tabula.io import read_pdf
en = read_pdf('EFD2020.pdf', pages='all')
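With `pages='all'`, `read_pdf` returns a list of DataFrames. When only the first page of a PDF carries the real header, later pages can end up with their first data row used as column names. The sketch below shows one possible way to stitch such pages back together. It uses small made-up stand-ins rather than the real PDF, and `restore_header_row` is a hypothetical helper, not part of tabula.

```python
import pandas as pd

# Stand-ins for two pages of output (made-up values)
page1 = pd.DataFrame({"UNITID": [100654, 100663], "STUFACR": [18, 20]})
# On this page the first data row was mistaken for the header
page2 = pd.DataFrame({"104151": [104160, 104179], "18": [17, 15]})


def restore_header_row(page, names):
    """Move a mis-read header back into the data and apply the real column names."""
    first_row = pd.DataFrame([list(page.columns)], columns=names)
    body = page.copy()
    body.columns = names
    return pd.concat([first_row, body], ignore_index=True)


names = list(page1.columns)
pages = [page1, restore_header_row(page2, names)]

# Stack the pages and convert the restored text values back to numbers
combined = pd.concat(pages, ignore_index=True)
combined = combined.apply(pd.to_numeric, errors="coerce")

print(combined)
```

One caveat: pandas renames repeated header values (labels such as `R.1` or `0.3`), so values recovered from the column labels need a manual check, and text flag columns would need to be left out of the numeric conversion. Re-reading the file with `pandas_options={'header': None}` passed to `read_pdf` may be a cleaner route, though that is worth confirming against the tabula-py documentation.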
In [66]:
en # We have a list of DataFrames again
Out[66]:
[    UNITID XGRCOHR  GRCOHRT XUGENTER  UGENTER XPGRCOH  PGRCOHR XRRFTCT  \
 0   100654       R   1525.0        R   1712.0       R     89.0       R   
 1   100663       R   2102.0        R   3529.0       R     60.0       R   
 2   100690       A      NaN        A      NaN       A      NaN       R   
 3   100706       R   1328.0        R   2186.0       R     61.0       R   
 4   100724       R    926.0        R   1123.0       R     82.0       R   
 ..     ...     ...      ...      ...      ...     ...      ...     ...   
 66  103927       A      NaN        A      NaN       A      NaN       R   
 67  103945       A      NaN        A      NaN       A      NaN       R   
 68  103954       A      NaN        A      NaN       A      NaN       R   
 69  103963       A      NaN        A      NaN       A      NaN       R   
 70  104090       A      NaN        A      NaN       A      NaN       R   
 
     RRFTCT XRRFTEX  ...  XRRPTIN RRPTIN  XRRPTCTA RRPTCTA  XRET_NM.1 RET_NMP  \
 0   1688.0       R  ...        R    0.0         R     6.0          R     2.0   
 1   2294.0       R  ...        R    0.0         R    42.0          R    20.0   
 2      2.0       Z  ...        A    NaN         A     NaN          A     NaN   
 3   1489.0       R  ...        R    0.0         R     8.0          R     2.0   
 4   1000.0       R  ...        R    0.0         R    19.0          R     1.0   
 ..     ...     ...  ...      ...    ...       ...     ...        ...     ...   
 66    41.0       Z  ...        A    NaN         A     NaN          A     NaN   
 67     2.0       Z  ...        A    NaN         A     NaN          A     NaN   
 68     3.0       R  ...        A    NaN         A     NaN          A     NaN   
 69    99.0       R  ...        R    0.0         R     0.0          R     0.0   
 70    19.0       R  ...        R    0.0         R     0.0          R     0.0   
 
     XRET_PCP RET_PCP  XSTUFACR STUFACR  
 0          R    33.0         R      18  
 1          R    48.0         R      20  
 2          A     NaN         R      13  
 3          R    25.0         R      19  
 4          R     5.0         R      15  
 ..       ...     ...       ...     ...  
 66         A     NaN         R      25  
 67         A     NaN         R       4  
 68         A     NaN         R      20  
 69         A     NaN         R      10  
 70         A     NaN         R      24  
 
 [71 rows x 33 columns],
     104151  R   12427 R.1    17702 R.2    70 R.3   13837 R.4  ...  Z.1  0.1  \
 0   104160  R   393.0   R   2269.0   R  17.0   R   627.0   R  ...    R  0.0   
 1   104179  R  5374.0   R  10018.0   R  54.0   R  6036.0   R  ...    R  0.0   
 2   104346  R   324.0   R   2131.0   R  15.0   R   514.0   R  ...    R  0.0   
 3   104391  A     NaN   A      NaN   A   NaN   R    56.0   R  ...    A  NaN   
 4   104425  R   478.0   R    882.0   R  54.0   R   520.0   R  ...    R  0.0   
 ..     ... ..     ...  ..      ...  ..   ...  ..     ...  ..  ...  ...  ...   
 66  107220  A     NaN   A      NaN   A   NaN   R    91.0   R  ...    A  NaN   
 67  107293  A     NaN   A      NaN   A   NaN   R    17.0   R  ...    A  NaN   
 68  107318  R    74.0   R    511.0   R  14.0   R    87.0   R  ...    R  0.0   
 69  107327  R   193.0   R    467.0   R  41.0   R   260.0   R  ...    R  0.0   
 70  107442  A     NaN   A      NaN   A   NaN   R     4.0   R  ...    A  NaN   
 
     R.10     124  R.11      62  R.12    50  R.13  18  
 0      R  1223.0     R   501.0     R  41.0     R  17  
 1      R  1647.0     R  1101.0     R  67.0     R  15  
 2      R   862.0     R   274.0     R  32.0     R  15  
 3      A     NaN     A     NaN     A   NaN     R  18  
 4      R   197.0     R    53.0     R  27.0     R  17  
 ..   ...     ...   ...     ...   ...   ...   ...  ..  
 66     A     NaN     A     NaN     A   NaN     R   5  
 67     A     NaN     A     NaN     A   NaN     R  20  
 68     R    56.0     R    20.0     R  36.0     R   9  
 69     R    43.0     R     9.0     R  21.0     R  17  
 70     A     NaN     A     NaN     A   NaN     R  20  
 
 [71 rows x 33 columns],
     107460  R     247 R.1      388 R.2    64 R.3     351  Z  ...  Z.3  0.3  \
 0   107488  A     NaN   A      NaN   A   NaN   R   172.0  R  ...    R  0.0   
 1   107512  R   389.0   R    650.0   R  60.0   R   441.0  R  ...    Z  0.0   
 2   107521  R    52.0   R    138.0   R  38.0   R    67.0  R  ...    R  0.0   
 3   107549  R   149.0   R    384.0   R  39.0   R   162.0  Z  ...    Z  0.0   
 4   107558  R   195.0   R    241.0   R  81.0   R   209.0  R  ...    A  NaN   
 ..     ... ..     ...  ..      ...  ..   ...  ..     ... ..  ...  ...  ...   
 66  110653  R  5742.0   R   8510.0   R  67.0   R  6053.0  R  ...    R  0.0   
 67  110662  R  6378.0   R  10173.0   R  63.0   R  5908.0  R  ...    R  0.0   
 68  110671  R  4818.0   R   7007.0   R  69.0   R  4743.0  R  ...    R  0.0   
 69  110680  R  6423.0   R   9794.0   R  66.0   R  6011.0  R  ...    R  0.0   
 70  110705  R  4813.0   R   7330.0   R  66.0   R  4928.0  R  ...    R  0.0   
 
     R.8  38.1  R.9    21  R.10    55  R.11  12  
 0     R   4.0    R   2.0     R  50.0     R   6  
 1     R   0.0    R   0.0     A   NaN     R  12  
 2     R  26.0    R   5.0     R  19.0     R  14  
 3     R  40.0    R  13.0     R  33.0     R  14  
 4     A   NaN    A   NaN     A   NaN     R  16  
 ..  ...   ...  ...   ...   ...   ...   ...  ..  
 66    R  16.0    R  10.0     R  63.0     R  18  
 67    R  11.0    R   8.0     R  73.0     R  18  
 68    R  26.0    R  18.0     R  69.0     R  22  
 69    R  12.0    R   6.0     R  50.0     R  19  
 70    R   7.0    R   5.0     R  71.0     R  18  
 
 [71 rows x 33 columns],
     110714  R    4158 R.1    5853 R.2    71 R.3    3700 R.4  ...  R.11  0.3  \
 0   110778  A     NaN   A     NaN   A   NaN   A     NaN   A  ...     A  NaN   
 1   110875  A     NaN   A     NaN   A   NaN   R    10.0   Z  ...     Z  0.0   
 2   110918  R     1.0   R     7.0   R  14.0   R     1.0   R  ...     A  NaN   
 3   111045  R     2.0   R     9.0   R  22.0   R     0.0   R  ...     R  0.0   
 4   111054  A     NaN   A     NaN   A   NaN   R    16.0   R  ...     A  NaN   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  115010  A     NaN   A     NaN   A   NaN   R     0.0   R  ...     A  NaN   
 67  115083  A     NaN   A     NaN   A   NaN   A     NaN   A  ...     A  NaN   
 68  115126  R  1088.0   R  4298.0   R  25.0   R  1374.0   R  ...     R  0.0   
 69  115296  R  1332.0   R  4236.0   R  31.0   R  1693.0   R  ...     R  0.0   
 70  115357  A     NaN   A     NaN   A   NaN   R    16.0   Z  ...     A  NaN   
 
     R.12    13.1  R.13      7  R.14    54  R.15  25  
 0      A     NaN     A    NaN     A   NaN     R   9  
 1      R    20.0     R    8.0     R  40.0     R  11  
 2      A     NaN     A    NaN     A   NaN     R   2  
 3      R     0.0     R    0.0     A   NaN     R  11  
 4      A     NaN     A    NaN     A   NaN     R  11  
 ..   ...     ...   ...    ...   ...   ...   ...  ..  
 66     A     NaN     A    NaN     A   NaN     R  15  
 67     A     NaN     A    NaN     A   NaN     R  12  
 68     R   756.0     R  265.0     R  35.0     R  29  
 69     R  1000.0     R  392.0     R  39.0     R  24  
 70     A     NaN     A    NaN     A   NaN     R  15  
 
 [71 rows x 33 columns],
     115393  R    987 R.1    2580 R.2    38 R.3   1024 R.4  ...  R.11  0.3  \
 0   115409  R  207.0   R   235.0   R  88.0   R  224.0   R  ...     A  NaN   
 1   115658  A    NaN   A     NaN   A   NaN   R    8.0   Z  ...     Z  0.0   
 2   115728  R  132.0   R   221.0   R  60.0   R  161.0   R  ...     R  0.0   
 3   115755  R  532.0   R  1536.0   R  35.0   R  815.0   Z  ...     Z  0.0   
 4   115773  R    6.0   R    40.0   R  15.0   R    1.0   R  ...     R  0.0   
 ..     ... ..    ...  ..     ...  ..   ...  ..    ...  ..  ...   ...  ...   
 66  119845  A    NaN   A     NaN   A   NaN   R   11.0   Z  ...     Z  0.0   
 67  119951  A    NaN   A     NaN   A   NaN   R   62.0   R  ...     A  NaN   
 68  120069  A    NaN   A     NaN   A   NaN   R   67.0   R  ...     A  NaN   
 69  120078  A    NaN   A     NaN   A   NaN   R   54.0   R  ...     A  NaN   
 70  120087  A    NaN   A     NaN   A   NaN   R   45.0   R  ...     A  NaN   
 
     R.12 1450.1  R.13  194  R.14     13  R.15    24  
 0      A    NaN     A  NaN     A    NaN     R   8.0  
 1      R    9.0     R  5.0     R   56.0     R  15.0  
 2      R    1.0     R  1.0     R  100.0     R   7.0  
 3      R    8.0     R  5.0     R   63.0     R  17.0  
 4      R    0.0     R  0.0     A    NaN     R  12.0  
 ..   ...    ...   ...  ...   ...    ...   ...   ...  
 66     R    1.0     R  1.0     R  100.0     R  15.0  
 67     A    NaN     A  NaN     A    NaN     R  31.0  
 68     A    NaN     A  NaN     A    NaN     R  15.0  
 69     A    NaN     A  NaN     A    NaN     R  33.0  
 70     A    NaN     A  NaN     A    NaN     R  16.0  
 
 [71 rows x 33 columns],
     120166  R       1 R.1     1.1 R.2   100 R.3       0 R.4  ...  R.10  0.7  \
 0   120184  A     NaN   A     NaN   A   NaN   A     NaN   A  ...     A  NaN   
 1   120254  R   402.0   R   433.0   R  93.0   R   562.0   R  ...     A  NaN   
 2   120290  R   710.0   R  2523.0   R  28.0   R   777.0   R  ...     R  0.0   
 3   120342  R  2101.0   R  5734.0   R  37.0   R  2376.0   R  ...     R  0.0   
 4   120403  R   224.0   R   321.0   R  70.0   R   274.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  123493  A     NaN   A     NaN   A   NaN   R    44.0   R  ...     A  NaN   
 67  123509  R   318.0   R  3307.0   R  10.0   R   575.0   R  ...     R  0.0   
 68  123527  R   942.0   R  3400.0   R  28.0   R  1587.0   R  ...     R  0.0   
 69  123554  R   468.0   R   649.0   R  72.0   R   516.0   R  ...     A  NaN   
 70  123563  R   684.0   R  2511.0   R  27.0   R     0.0   R  ...     R  0.0   
 
     R.11     0.8  R.12    0.9  A.1 Unnamed: 1  R.13     2  
 0      A     NaN     A    NaN    A        NaN     R   8.0  
 1      A     NaN     A    NaN    A        NaN     R   8.0  
 2      R   450.0     R  221.0    R       49.0     R  26.0  
 3      R  1285.0     R  521.0    R       41.0     R  27.0  
 4      R     0.0     R    0.0    A        NaN     R   7.0  
 ..   ...     ...   ...    ...  ...        ...   ...   ...  
 66     A     NaN     A    NaN    A        NaN     R  11.0  
 67     R   466.0     R  154.0    R       33.0     R  26.0  
 68     R  1009.0     R  355.0    R       35.0     R  21.0  
 69     A     NaN     A    NaN    A        NaN     R   9.0  
 70     R     0.0     R    0.0    A        NaN     R  25.0  
 
 [71 rows x 33 columns],
     123572  R     877 R.1    1633 R.2    54 R.3  1531  Z  ...  Z.3  0.3  R.8  \
 0   123642  A     NaN   A     NaN   A   NaN   R     8  R  ...    R  0.0    R   
 1   123651  R   551.0   R   687.0   R  80.0   R   481  R  ...    R  0.0    R   
 2   123679  A     NaN   A     NaN   A   NaN   R   122  Z  ...    Z  0.0    R   
 3   123800  R  1492.0   R  4503.0   R  33.0   R  2089  R  ...    R  0.0    R   
 4   123952  R    27.0   R    64.0   R  42.0   R    30  R  ...    A  NaN    A   
 ..     ... ..     ...  ..     ...  ..   ...  ..   ... ..  ...  ...  ...  ...   
 66  127918  R   481.0   R   739.0   R  65.0   R   481  R  ...    A  NaN    A   
 67  127945  R    75.0   R   351.0   R  21.0   R    80  R  ...    R  0.0    R   
 68  127954  A     NaN   A     NaN   A   NaN   R     0  R  ...    A  NaN    A   
 69  128106  R   538.0   R   987.0   R  55.0   R   609  R  ...    R  0.0    R   
 70  128151  A     NaN   A     NaN   A   NaN   R   321  R  ...    R  0.0    R   
 
       20.1  R.9      9  R.10     45  R.11  21  
 0      3.0    R    3.0     R  100.0     R  15  
 1      4.0    R    4.0     R  100.0     R  16  
 2      0.0    R    0.0     A    NaN     R  18  
 3   1187.0    R  513.0     R   43.0     R  24  
 4      NaN    A    NaN     A    NaN     R  15  
 ..     ...  ...    ...   ...    ...   ...  ..  
 66     NaN    A    NaN     A    NaN     R  11  
 67    50.0    R   11.0     R   22.0     R  10  
 68     NaN    A    NaN     A    NaN     R  15  
 69    25.0    R   18.0     R   72.0     R  15  
 70    12.0    R    9.0     R   75.0     R  20  
 
 [71 rows x 33 columns],
     128179  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      56 R.1  ...  \
 0   128188  A         NaN   A         NaN   A         NaN  R   107.0   R  ...   
 1   128258  R       262.0   R       407.0   R        64.0  R   255.0   R  ...   
 2   128328  R      1132.0   R      1132.0   R       100.0  R  1132.0   R  ...   
 3   128337  A         NaN   A         NaN   A         NaN  R    36.0   R  ...   
 4   128391  R       413.0   R      1537.0   R        27.0  R   450.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  131520  R      2358.0   R      2627.0   R        90.0  R  1920.0   R  ...   
 67  131803  A         NaN   A         NaN   A         NaN  R     0.0   R  ...   
 68  131830  A         NaN   A         NaN   A         NaN  R     3.0   R  ...   
 69  131876  R       221.0   R       386.0   R        57.0  R   233.0   R  ...   
 70  132374  A         NaN   A         NaN   A         NaN  R   962.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6  22  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R  19  
 1     R        0.0    R       71.0    R       36.0    R       51.0    R  11  
 2     A        NaN    A        NaN    A        NaN    A        NaN    R   6  
 3     Z        0.0    R       16.0    R        8.0    R       50.0    R  19  
 4     R        0.0    R        4.0    R        2.0    R       50.0    R  18  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    Z        0.0    R        5.0    R        1.0    R       20.0    R  12  
 67    R        0.0    R       47.0    R       20.0    R       43.0    R  28  
 68    A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 69    Z        0.0    R       12.0    R        3.0    R       25.0    R  11  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  12  
 
 [71 rows x 33 columns],
     132408  R      26 R.1      169 R.2    15 R.3      20 R.4  ...  R.11  0.3  \
 0   132471  R   636.0   R   1187.0   R  54.0   R   709.0   R  ...     R  0.0   
 1   132602  R   677.0   R    752.0   R  90.0   R   559.0   R  ...     R  0.0   
 2   132657  R   669.0   R    841.0   R  80.0   R   725.0   R  ...     R  0.0   
 3   132675  A     NaN   A      NaN   A   NaN   R    87.0   R  ...     R  0.0   
 4   132693  R  1111.0   R   2379.0   R  47.0   R     0.0   R  ...     R  0.0   
 ..     ... ..     ...  ..      ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  135647  A     NaN   A      NaN   A   NaN   R   164.0   R  ...     R  0.0   
 67  135717  R  6469.0   R  10319.0   R  63.0   R     0.0   R  ...     Z  0.0   
 68  135726  R  2339.0   R   3013.0   R  78.0   R  2180.0   R  ...     R  0.0   
 69  135735  A     NaN   A      NaN   A   NaN   R   261.0   Z  ...     Z  0.0   
 70  136109  A     NaN   A      NaN   A   NaN   R    16.0   Z  ...     Z  0.0   
 
     R.12    5.1  R.13      4  R.14     80  R.15   8  
 0      R    8.0     R    1.0     R   13.0     R  15  
 1      R    1.0     R    0.0     R    0.0     R  15  
 2      R   14.0     R    6.0     R   43.0     R  17  
 3      R   60.0     R    6.0     R   10.0     R   6  
 4      R    0.0     R    0.0     A    NaN     R  17  
 ..   ...    ...   ...    ...   ...    ...   ...  ..  
 66     R  127.0     R  105.0     R   83.0     R  15  
 67     R    0.0     R    0.0     A    NaN     R  19  
 68     R   23.0     R   18.0     R   78.0     R  12  
 69     R  149.0     R  111.0     R   74.0     R  17  
 70     R    8.0     R    8.0     R  100.0     R  15  
 
 [71 rows x 33 columns],
     136145  R      77 R.1     253 R.2    30 R.3     0 R.4  ...  Z.1  0.7  R.9  \
 0   136172  R  2613.0   R  4125.0   R  63.0   R  2458   R  ...    R  0.0    R   
 1   136215  R  1674.0   R  2244.0   R  75.0   R  1510   R  ...    R  0.0    R   
 2   136233  R   460.0   R   956.0   R  48.0   R     0   R  ...    Z  0.0    R   
 3   136303  A     NaN   A     NaN   A   NaN   R    98   R  ...    R  0.0    R   
 4   136330  R   516.0   R  1169.0   R  44.0   R   514   R  ...    R  0.0    R   
 ..     ... ..     ...  ..     ...  ..   ...  ..   ...  ..  ...  ...  ...  ...   
 66  139153  R    47.0   R   209.0   R  22.0   R    10   R  ...    R  0.0    R   
 67  139199  R   160.0   R   625.0   R  26.0   R   171   Z  ...    Z  0.0    R   
 68  139205  R   224.0   R   453.0   R  49.0   R   192   R  ...    R  0.0    R   
 69  139250  R   502.0   R  1099.0   R  46.0   R   392   R  ...    R  0.0    R   
 70  139278  R   332.0   R  2059.0   R  16.0   R   453   R  ...    Z  0.0    R   
 
       0.8  R.10    0.9  A.1 Unnamed: 1  R.11  19  
 0   183.0     R  103.0    R       56.0     R  19  
 1    16.0     R    8.0    R       50.0     R  17  
 2     0.0     R    0.0    A        NaN     R  23  
 3    31.0     R    3.0    R       10.0     R   8  
 4     6.0     R    3.0    R       50.0     R  12  
 ..    ...   ...    ...  ...        ...   ...  ..  
 66   16.0     R    8.0    R       50.0     R  11  
 67   12.0     R    3.0    R       25.0     R  11  
 68    3.0     R    1.0    R       33.0     R  16  
 69   46.0     R   24.0    R       52.0     R  21  
 70  804.0     R  369.0    R       46.0     R  20  
 
 [71 rows x 33 columns],
     139311  R     690 R.1    2226 R.2    31 R.3     481 R.4  ...  R.11  0.3  \
 0   139357  R   168.0   R   964.0   R  17.0   R   226.0   R  ...     Z  0.0   
 1   139366  R  1229.0   R  2304.0   R  53.0   R   919.0   R  ...     R  0.0   
 2   139384  R   443.0   R  2412.0   R  18.0   R   468.0   R  ...     Z  0.0   
 3   139393  R   196.0   R   240.0   R  82.0   R   219.0   R  ...     A  NaN   
 4   139463  R   700.0   R  1354.0   R  52.0   R   534.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  142115  R  2753.0   R  7002.0   R  39.0   R  2931.0   R  ...     Z  0.0   
 67  142179  R   151.0   R   439.0   R  34.0   R   175.0   Z  ...     Z  0.0   
 68  142276  R  1233.0   R  3372.0   R  37.0   R  1073.0   R  ...     Z  0.0   
 69  142285  R  1394.0   R  2851.0   R  49.0   R  1456.0   R  ...     Z  0.0   
 70  142294  R   265.0   R   311.0   R  85.0   R   360.0   R  ...     Z  0.0   
 
     R.12   84.1  R.13     48  R.14     57  R.15  19  
 0      R  369.0     R  151.0     R   41.0     R  15  
 1      R   35.0     R   17.0     R   49.0     R  17  
 2      R  505.0     R  262.0     R   52.0     R  22  
 3      A    NaN     A    NaN     A    NaN     R  12  
 4      R   62.0     R   29.0     R   47.0     R  19  
 ..   ...    ...   ...    ...   ...    ...   ...  ..  
 66     R   92.0     R   36.0     R   39.0     R  18  
 67     R  114.0     R   59.0     R   52.0     R  11  
 68     R   69.0     R   24.0     R   35.0     R  13  
 69     R   19.0     R    7.0     R   37.0     R  16  
 70     R    1.0     R    1.0     R  100.0     R  11  
 
 [71 rows x 33 columns],
     142328  R     453 R.1     742 R.2    61 R.3     335 R.4  ...  R.11  0.3  \
 0   142407  A     NaN   A     NaN   A   NaN   R     7.0   Z  ...     Z  0.0   
 1   142416  A     NaN   A     NaN   A   NaN   R     9.0   R  ...     A  NaN   
 2   142443  R   594.0   R  1785.0   R  33.0   R   645.0   R  ...     R  0.0   
 3   142461  R   282.0   R   392.0   R  72.0   R   278.0   Z  ...     A  NaN   
 4   142489  A     NaN   A     NaN   A   NaN   R    18.0   R  ...     A  NaN   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  145707  R   200.0   R   479.0   R  42.0   R   227.0   R  ...     R  0.0   
 67  145725  R   513.0   R   714.0   R  72.0   R   583.0   R  ...     R  0.0   
 68  145813  R  3324.0   R  5056.0   R  66.0   R  3835.0   R  ...     R  0.0   
 69  145831  R   404.0   R  1008.0   R  40.0   R   410.0   R  ...     R  0.0   
 70  146205  R   502.0   R  1586.0   R  32.0   R   515.0   R  ...     R  0.0   
 
     R.12   23.1  R.13     5  R.14    22  R.15  15  
 0      R    2.0     R   1.0     R  50.0     R   9  
 1      A    NaN     A   NaN     A   NaN     R  20  
 2      R  243.0     R  86.0     R  35.0     R  13  
 3      A    NaN     A   NaN     A   NaN     R  18  
 4      A    NaN     A   NaN     A   NaN     R  13  
 ..   ...    ...   ...   ...   ...   ...   ...  ..  
 66     R   20.0     R   9.0     R  45.0     R  11  
 67     R    0.0     R   0.0     A   NaN     R  10  
 68     R   25.0     R  19.0     R  76.0     R  19  
 69     R  146.0     R  77.0     R  53.0     R  14  
 70     R  156.0     R  62.0     R  40.0     R  20  
 
 [71 rows x 33 columns],
     146278  R     333 R.1     924 R.2     36 R.3     391 R.4  ...  R.11  0.2  \
 0   146296  R   689.0   R  2014.0   R   34.0   R   881.0   R  ...     R  0.0   
 1   146339  R   154.0   R   322.0   R   48.0   R   157.0   R  ...     R  0.0   
 2   146348  R   263.0   R   743.0   R   35.0   R   280.0   R  ...     R  0.0   
 3   146366  R   305.0   R  1745.0   R   17.0   R   239.0   R  ...     R  0.0   
 4   146418  R   319.0   R  1181.0   R   27.0   R   387.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..    ...  ..     ...  ..  ...   ...  ...   
 66  149222  R  1322.0   R  2646.0   R   50.0   R   928.0   R  ...     R  0.0   
 67  149231  R  1509.0   R  2729.0   R   55.0   R  1645.0   R  ...     R  0.0   
 68  149310  A     NaN   A     NaN   A    NaN   R    38.0   R  ...     A  NaN   
 69  149329  R    22.0   R    22.0   R  100.0   R    19.0   R  ...     A  NaN   
 70  149365  R   157.0   R  1260.0   R   12.0   R   203.0   R  ...     R  0.0   
 
     R.12    128  R.13     66  R.14    52  R.15  13  
 0      R  579.0     R  233.0     R  40.0     R  15  
 1      R    0.0     R    0.0     A   NaN     R  10  
 2      R  142.0     R   68.0     R  48.0     R  12  
 3      R   60.0     R   25.0     R  42.0     R  17  
 4      R  129.0     R   39.0     R  30.0     R  16  
 ..   ...    ...   ...    ...   ...   ...   ...  ..  
 66     R   12.0     R    3.0     R  25.0     R  11  
 67     R   22.0     R    7.0     R  32.0     R  16  
 68     A    NaN     A    NaN     A   NaN     R   8  
 69     A    NaN     A    NaN     A   NaN     R  13  
 70     R  173.0     R   59.0     R  34.0     R  12  
 
 [71 rows x 33 columns],
     149499  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      7 R.1  ...  \
 0   149505  R       178.0   R       294.0   R        61.0  R  160.0   R  ...   
 1   149514  R       100.0   R       192.0   R        52.0  R  102.0   R  ...   
 2   149532  R       759.0   R      2826.0   R        27.0  R  827.0   R  ...   
 3   149550  A         NaN   A         NaN   A         NaN  R   35.0   R  ...   
 4   149639  R         9.0   R        18.0   R        50.0  R   15.0   Z  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  152600  R       635.0   R       783.0   R        81.0  R  646.0   R  ...   
 67  152628  A         NaN   A         NaN   A         NaN  R   14.0   R  ...   
 68  152637  R      1278.0   R      8416.0   R        15.0  R   25.0   R  ...   
 69  152673  R       240.0   R       249.0   R        96.0  R  229.0   R  ...   
 70  152798  A         NaN   A         NaN   A         NaN  R    0.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6 7.2  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R  10  
 1     R        0.0    R        2.0    R        0.0    R        0.0    R   8  
 2     R        0.0    R      347.0    R      126.0    R       36.0    R  18  
 3     R        0.0    R        7.0    R        5.0    R       71.0    R  11  
 4     A        NaN    A        NaN    A        NaN    A        NaN    R   4  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    R        0.0    R        0.0    R        0.0    A        NaN    R  11  
 67    R        0.0    R        5.0    R        2.0    R       40.0    R  10  
 68    R        0.0    R        3.0    R        2.0    R       67.0    R  20  
 69    A        NaN    A        NaN    A        NaN    A        NaN    R   9  
 70    R        0.0    R        0.0    R        0.0    A        NaN    R  11  
 
 [71 rows x 33 columns],
     152992  R    172 R.1     364 R.2    47 R.3    136 R.4  ...  A.2  \
 0   153001  R  229.0   R   427.0   R  54.0   R  230.0   R  ...    R   
 1   153074  A    NaN   A     NaN   A   NaN   R   19.0   R  ...    A   
 2   153083  A    NaN   A     NaN   A   NaN   R   17.0   R  ...    A   
 3   153108  R  324.0   R   377.0   R  86.0   R  337.0   R  ...    R   
 4   153126  R  146.0   R   210.0   R  70.0   R  134.0   R  ...    R   
 ..     ... ..    ...  ..     ...  ..   ...  ..    ...  ..  ...  ...   
 66  154590  R  331.0   R   491.0   R  67.0   R  363.0   R  ...    R   
 67  154642  R  331.0   R  1020.0   R  32.0   R  351.0   R  ...    R   
 68  154688  R  219.0   R   334.0   R  66.0   R  244.0   R  ...    R   
 69  154697  R  408.0   R  1621.0   R  25.0   R  397.0   R  ...    R   
 70  154712  R  575.0   R   772.0   R  74.0   R  524.0   R  ...    R   
 
    Unnamed: 2  A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9  11  
 0         0.0    R        0.0    R        0.0    A        NaN    R  10  
 1         NaN    A        NaN    A        NaN    A        NaN    R   9  
 2         NaN    A        NaN    A        NaN    A        NaN    R  10  
 3         0.0    R        0.0    R        0.0    A        NaN    R  11  
 4         0.0    R        0.0    R        0.0    A        NaN    R   8  
 ..        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66        0.0    R        0.0    R        0.0    A        NaN    R  15  
 67        0.0    R      116.0    R       34.0    R       29.0    R  20  
 68        0.0    R        1.0    R        1.0    R      100.0    R  10  
 69        0.0    R      196.0    R       45.0    R       23.0    R  18  
 70        0.0    R        0.0    R        0.0    A        NaN    R  13  
 
 [71 rows x 33 columns],
     154721  R     221 R.1     325 R.2    68 R.3     245 R.4  ...  R.11  0.4  \
 0   154749  R   151.0   R   207.0   R  73.0   R   155.0   R  ...     R  0.0   
 1   154800  R  1178.0   R  2313.0   R  51.0   R  1259.0   R  ...     R  0.0   
 2   154855  A     NaN   A     NaN   A   NaN   R    85.0   R  ...     R  0.0   
 3   154907  R   168.0   R   630.0   R  27.0   R   375.0   Z  ...     R  0.0   
 4   154925  R   385.0   R   665.0   R  58.0   R   469.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  156417  R     7.0   R    16.0   R  44.0   R    17.0   R  ...     R  0.0   
 67  156426  A     NaN   A     NaN   A   NaN   R     8.0   Z  ...     Z  0.0   
 68  156471  A     NaN   A     NaN   A   NaN   R    27.0   R  ...     R  0.0   
 69  156541  R   590.0   R  1514.0   R  39.0   R   741.0   R  ...     R  0.0   
 70  156620  R  2165.0   R  3763.0   R  58.0   R  1905.0   R  ...     Z  0.0   
 
     R.12    0.5  R.13    0.6  A Unnamed: 0  R.14  14  
 0      R    0.0     R    0.0  A        NaN     R  10  
 1      R  487.0     R  166.0  R       34.0     R  17  
 2      R    1.0     R    1.0  R      100.0     R  17  
 3      R   23.0     R    8.0  R       35.0     R  14  
 4      R   12.0     R    6.0  R       50.0     R  21  
 ..   ...    ...   ...    ... ..        ...   ...  ..  
 66     R    1.0     R    1.0  R      100.0     R   9  
 67     R   27.0     R   23.0  R       85.0     R  20  
 68     R    1.0     R    0.0  R        0.0     R  14  
 69     R   15.0     R    3.0  R       20.0     R  20  
 70     R   31.0     R   14.0  R       45.0     R  15  
 
 [71 rows x 33 columns],
     156648  R     639 R.1    1389 R.2    46 R.3     767 R.4  ...  R.11  0.2  \
 0   156745  R   426.0   R   482.0   R  88.0   R   314.0   R  ...     R  0.0   
 1   156754  A     NaN   A     NaN   A   NaN   R     2.0   R  ...     A  NaN   
 2   156790  R   258.0   R   436.0   R  59.0   R   350.0   R  ...     R  0.0   
 3   156842  A     NaN   A     NaN   A   NaN   R     5.0   Z  ...     Z  0.0   
 4   156851  R   151.0   R   251.0   R  60.0   R   186.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  159647  R  1858.0   R  3474.0   R  53.0   R  1967.0   R  ...     R  0.0   
 67  159656  R   765.0   R   975.0   R  78.0   R   822.0   R  ...     R  0.0   
 68  159717  R  1194.0   R  2191.0   R  54.0   R  1202.0   R  ...     R  0.0   
 69  159939  R  1078.0   R  2387.0   R  45.0   R   926.0   R  ...     R  0.0   
 70  159948  R    18.0   R   184.0   R  10.0   R    18.0   R  ...     R  0.0   
 
     R.12 325.1  R.13   116  R.14    36  R.15  22  
 0      R   0.0     R   0.0     A   NaN     R  12  
 1      A   NaN     A   NaN     A   NaN     R  13  
 2      R  54.0     R  34.0     R  63.0     R  15  
 3      R  42.0     R  33.0     R  79.0     R  17  
 4      R  66.0     R  30.0     R  45.0     R  15  
 ..   ...   ...   ...   ...   ...   ...   ...  ..  
 66     R   6.0     R   1.0     R  17.0     R  23  
 67     R   8.0     R   3.0     R  38.0     R  13  
 68     R  13.0     R   6.0     R  46.0     R  21  
 69     R   9.0     R   3.0     R  33.0     R  20  
 70     R  51.0     R  15.0     R  29.0     R  15  
 
 [71 rows x 33 columns],
     159966  R    1189 R.1    1847 R.2    64 R.3  1169 R.4  ...  R.11  0.2  \
 0   159993  R  1123.0   R  2371.0   R  47.0   R  1123   R  ...     R  0.0   
 1   160010  R   125.0   R   357.0   R  35.0   R   132   R  ...     R  0.0   
 2   160038  R  1401.0   R  3523.0   R  40.0   R  1312   R  ...     R  0.0   
 3   160065  R    52.0   R   308.0   R  17.0   R    61   R  ...     R  0.0   
 4   160074  R    82.0   R   299.0   R  27.0   R    59   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..   ...  ..  ...   ...  ...   
 66  162557  R   664.0   R  1377.0   R  48.0   R   785   R  ...     R  0.0   
 67  162584  R   627.0   R   980.0   R  64.0   R   739   R  ...     Z  0.0   
 68  162609  R   145.0   R   284.0   R  51.0   R   197   R  ...     R  0.0   
 69  162654  R   232.0   R   256.0   R  91.0   R   338   R  ...     R  0.0   
 70  162690  R   440.0   R  1188.0   R  37.0   R   534   R  ...     R  0.0   
 
     R.12    6.1  R.13      3  R.14    50  R.15  19  
 0      R   20.0     R   11.0     R  55.0     R  18  
 1      R  130.0     R   49.0     R  38.0     R   8  
 2      R   67.0     R   27.0     R  40.0     R  19  
 3      R    1.0     R    0.0     R   0.0     R  10  
 4      R    2.0     R    1.0     R  50.0     R  10  
 ..   ...    ...   ...    ...   ...   ...   ...  ..  
 66     R  343.0     R  154.0     R  45.0     R  17  
 67     R    1.0     R    0.0     R   0.0     R  14  
 68     R    5.0     R    3.0     R  60.0     R  13  
 69     R    0.0     Z    0.0     A   NaN     R   9  
 70     R  284.0     R  131.0     R  46.0     R  18  
 
 [71 rows x 33 columns],
     162706  R     696 R.1    1312 R.2    53 R.3     736 R.4  ...  R.11  0.3  \
 0   162760  R   293.0   R   387.0   R  76.0   R   320.0   R  ...     R  0.0   
 1   162779  R   751.0   R  1917.0   R  39.0   R   962.0   R  ...     R  0.0   
 2   162830  A     NaN   A     NaN   A   NaN   R    41.0   Z  ...     A  NaN   
 3   162928  R  1404.0   R  1759.0   R  80.0   R  1475.0   R  ...     A  NaN   
 4   163028  A     NaN   A     NaN   A   NaN   R   162.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  165644  R   123.0   R   169.0   R  73.0   R   131.0   R  ...     Z  0.0   
 67  165662  R   938.0   R  1149.0   R  82.0   R   940.0   Z  ...     Z  0.0   
 68  165671  R   419.0   R   473.0   R  89.0   R   582.0   R  ...     A  NaN   
 69  165699  R   769.0   R   916.0   R  84.0   R   846.0   R  ...     R  0.0   
 70  165750  A     NaN   A     NaN   A   NaN   R     3.0   R  ...     A  NaN   
 
     R.12  322.1  R.13    138  R.14     43  R.15  17  
 0      R    0.0     Z    0.0     A    NaN     R  11  
 1      R  661.0     R  297.0     R   45.0     R  15  
 2      A    NaN     A    NaN     A    NaN     R  15  
 3      A    NaN     A    NaN     A    NaN     R   6  
 4      R   14.0     R   13.0     R   93.0     R  20  
 ..   ...    ...   ...    ...   ...    ...   ...  ..  
 66     R    0.0     Z    0.0     A    NaN     R  10  
 67     R    2.0     R    2.0     R  100.0     R  13  
 68     A    NaN     A    NaN     A    NaN     R  13  
 69     R    0.0     R    0.0     A    NaN     R  13  
 70     A    NaN     A    NaN     A    NaN     R  10  
 
 [71 rows x 33 columns],
     165802  R     183 R.1     331 R.2    55 R.3     157  Z  ...  Z.3  0.4  \
 0   165820  R   652.0   R  1039.0   R  63.0   R   671.0  R  ...    R  0.0   
 1   165866  R   608.0   R   959.0   R  63.0   R   772.0  R  ...    R  0.0   
 2   165884  R   112.0   R   216.0   R  52.0   R     5.0  R  ...    R  0.0   
 3   165936  R   373.0   R   404.0   R  92.0   R   355.0  R  ...    R  0.0   
 4   165981  R   106.0   R   660.0   R  16.0   R   130.0  R  ...    R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ... ..  ...  ...  ...   
 66  168421  R  1294.0   R  1352.0   R  96.0   R  1199.0  R  ...    A  NaN   
 67  168430  R   799.0   R  1184.0   R  67.0   R   878.0  R  ...    R  0.0   
 68  168528  R   535.0   R   611.0   R  88.0   R   540.0  R  ...    R  0.0   
 69  168546  R   476.0   R   523.0   R  91.0   R   415.0  R  ...    Z  0.0   
 70  168555  A     NaN   A     NaN   A   NaN   R    14.0  R  ...    R  0.0   
 
     R.8    0.5  Z.4   0.6  A Unnamed: 0  R.9  11  
 0     R    2.0    R   1.0  R       50.0    R  14  
 1     R    4.0    R   0.0  R        0.0    R  13  
 2     R    1.0    R   0.0  R        0.0    R   9  
 3     R    3.0    R   1.0  R       33.0    R  10  
 4     R  169.0    R  79.0  R       47.0    R  12  
 ..  ...    ...  ...   ... ..        ...  ...  ..  
 66    A    NaN    A   NaN  A        NaN    R  13  
 67    R    8.0    R   4.0  R       50.0    R  16  
 68    R    1.0    R   0.0  R        0.0    R  14  
 69    R    0.0    Z   0.0  A        NaN    R  12  
 70    R    3.0    R   2.0  R       67.0    R   7  
 
 [71 rows x 33 columns],
     168573  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2 A.3  Unnamed: 3 A.4  \
 0   168591  R       396.0   R       454.0   R        87.0   R         384   R   
 1   168607  R       201.0   R       537.0   R        37.0   R         246   R   
 2   168740  R       250.0   R       354.0   R        71.0   R         279   R   
 3   168786  R       278.0   R       446.0   R        62.0   R         257   R   
 4   168847  R       347.0   R      1703.0   R        20.0   R         170   R   
 ..     ... ..         ...  ..         ...  ..         ...  ..         ...  ..   
 66  171535  R      1571.0   R      6722.0   R        23.0   R        1044   R   
 67  171571  R      2213.0   R      3971.0   R        56.0   R        2633   R   
 68  171599  R       290.0   R       452.0   R        64.0   R         338   R   
 69  171775  A         NaN   A         NaN   A         NaN   R           5   R   
 70  171881  R        17.0   R        59.0   R        29.0   R          11   R   
 
     ...  A.11 Unnamed: 11  A.12 Unnamed: 12  A.13 Unnamed: 13  A.14  \
 0   ...     R         0.0     R         3.0     R         0.0     R   
 1   ...     R         0.0     R        83.0     R        37.0     R   
 2   ...     R         0.0     R         3.0     R         2.0     R   
 3   ...     R         0.0     R         5.0     R         1.0     R   
 4   ...     R         0.0     R        82.0     R        30.0     R   
 ..  ...   ...         ...   ...         ...   ...         ...   ...   
 66  ...     R         0.0     R      1288.0     R       741.0     R   
 67  ...     R         0.0     R        34.0     R        23.0     R   
 68  ...     R         0.0     R         1.0     R         1.0     R   
 69  ...     R         0.0     R         5.0     R         5.0     R   
 70  ...     R         0.0     R         7.0     R         1.0     R   
 
    Unnamed: 14  R  25  
 0          0.0  R  13  
 1         45.0  R  13  
 2         67.0  R  10  
 3         20.0  R  11  
 4         37.0  R   8  
 ..         ... ..  ..  
 66        58.0  R  23  
 67        68.0  R  19  
 68       100.0  R  15  
 69       100.0  R  13  
 70        14.0  R   9  
 
 [71 rows x 33 columns],
     171988  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      34 R.1  ...  \
 0   171997  A         NaN   A         NaN   A         NaN  R    61.0   R  ...   
 1   172015  A         NaN   A         NaN   A         NaN  R    54.0   R  ...   
 2   172033  R         2.0   R        17.0   R        12.0  R     2.0   R  ...   
 3   172051  R      1350.0   R      1821.0   R        74.0  R  1458.0   R  ...   
 4   172200  R       649.0   R      2700.0   R        24.0  R     0.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  174437  R       206.0   R       326.0   R        63.0  R   245.0   R  ...   
 67  174473  R       289.0   R      1202.0   R        24.0  R   344.0   R  ...   
 68  174491  R       361.0   R      1591.0   R        23.0  R   399.0   R  ...   
 69  174507  R        10.0   R       127.0   R         8.0  R     0.0   R  ...   
 70  174525  R        24.0   R        54.0   R        44.0  R     7.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.5  18  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R  22  
 1     A        NaN    A        NaN    A        NaN    A        NaN    R  23  
 2     R        0.0    R       10.0    R        6.0    R       60.0    R   8  
 3     R        0.0    R       12.0    R        8.0    R       67.0    R  17  
 4     R        0.0    R        0.0    R        0.0    A        NaN    R  21  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    R        0.0    R        0.0    R        0.0    A        NaN    R  15  
 67    R        0.0    R       80.0    R       37.0    R       46.0    R  18  
 68    R        0.0    R        5.0    R        4.0    R       80.0    R  18  
 69    A        NaN    A        NaN    A        NaN    A        NaN    R   7  
 70    R        0.0    R        0.0    R        0.0    A        NaN    R  10  
 
 [71 rows x 33 columns],
     174570  R      94 R.1    1016 R.2     9 R.3     120 R.4  ...  R.11  0.3  \
 0   174604  R    44.0   R    98.0   R  45.0   R    45.0   R  ...     R  0.0   
 1   174738  R   588.0   R  2231.0   R  26.0   R   637.0   R  ...     R  0.0   
 2   174747  R   400.0   R   427.0   R  94.0   R   438.0   R  ...     A  NaN   
 3   174756  R   369.0   R  1565.0   R  24.0   R   495.0   R  ...     R  0.0   
 4   174783  R   939.0   R  3823.0   R  25.0   R  1200.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  176798  A     NaN   A     NaN   A   NaN   R    20.0   R  ...     R  0.0   
 67  176910  R    30.0   R    48.0   R  63.0   R    37.0   R  ...     R  0.0   
 68  176947  R   343.0   R   434.0   R  79.0   R   318.0   R  ...     R  0.0   
 69  176965  R  1063.0   R  2120.0   R  50.0   R  1227.0   R  ...     R  0.0   
 70  176983  A     NaN   A     NaN   A   NaN   R    43.0   R  ...     R  0.0   
 
     R.12   29.1  R.13     14  R.14   48.1  R.15  25  
 0      R    4.0     R    1.0     R   25.0     R  18  
 1      R  243.0     R  123.0     R   51.0     R  22  
 2      A    NaN     A    NaN     A    NaN     R  12  
 3      R  296.0     R  150.0     R   51.0     R  21  
 4      R   24.0     R   12.0     R   50.0     R  19  
 ..   ...    ...   ...    ...   ...    ...   ...  ..  
 66     R   45.0     R   37.0     R   82.0     R  12  
 67     R    0.0     R    0.0     A    NaN     R   8  
 68     R    3.0     R    1.0     R   33.0     R  12  
 69     R   22.0     R    9.0     R   41.0     R  15  
 70     R    8.0     R    8.0     R  100.0     R   3  
 
 [71 rows x 33 columns],
     177038  R       2 R.1      37 R.2     5 R.3       1 R.4  ...  Z.1  0.6  \
 0   177065  R   264.0   R  1681.0   R  16.0   R   212.0   R  ...    R  0.0   
 1   177083  R     3.0   R    10.0   R  30.0   R     4.0   R  ...    A  NaN   
 2   177117  R   102.0   R   122.0   R  84.0   R    92.0   R  ...    R  0.0   
 3   177135  R   793.0   R  1212.0   R  65.0   R   788.0   R  ...    R  0.0   
 4   177144  R   254.0   R   345.0   R  74.0   R   254.0   R  ...    R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...  ...  ...   
 66  179566  R  2482.0   R  7410.0   R  33.0   R  2641.0   R  ...    R  0.0   
 67  179645  R   534.0   R  1143.0   R  47.0   R   530.0   R  ...    R  0.0   
 68  179715  R   327.0   R   482.0   R  68.0   R   326.0   R  ...    R  0.0   
 69  179867  R  1801.0   R  2221.0   R  81.0   R  1730.0   R  ...    R  0.0   
 70  179894  R   397.0   R   652.0   R  61.0   R   387.0   R  ...    R  0.0   
 
     R.10    0.7  Z.2   0.8  A Unnamed: 0  R.11   7  
 0      R  144.0    R  39.0  R       27.0     R  23  
 1      A    NaN    A   NaN  A        NaN     R   4  
 2      R    0.0    R   0.0  A        NaN     R   6  
 3      R  156.0    R  56.0  R       36.0     R  13  
 4      R    2.0    R   2.0  R      100.0     R  14  
 ..   ...    ...  ...   ... ..        ...   ...  ..  
 66     R   38.0    R  10.0  R       26.0     R  20  
 67     R  160.0    R  62.0  R       39.0     R  19  
 68     R   58.0    R  31.0  R       53.0     R  19  
 69     R    2.0    R   1.0  R       50.0     R   7  
 70     R   10.0    R   6.0  R       60.0     R  12  
 
 [71 rows x 33 columns],
     179946  R     177 R.1     214 R.2    83 R.3     142 R.4  ...  A.2  \
 0   179955  R   195.0   R   245.0   R  80.0   R   167.0   Z  ...    Z   
 1   179964  R   158.0   R   298.0   R  53.0   R   155.0   R  ...    R   
 2   179991  A     NaN   A     NaN   A   NaN   R    21.0   R  ...    A   
 3   180054  R    66.0   R   114.0   R  58.0   R    77.0   R  ...    R   
 4   180063  A     NaN   A     NaN   A   NaN   R    17.0   R  ...    A   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...  ...   
 66  182500  R   891.0   R  2030.0   R  44.0   R     0.0   R  ...    R   
 67  182564  R   329.0   R   846.0   R  39.0   R     0.0   R  ...    R   
 68  182634  R   233.0   R   313.0   R  74.0   R   234.0   R  ...    R   
 69  182652  A     NaN   A     NaN   A   NaN   R    15.0   R  ...    A   
 70  182670  R  1057.0   R  1124.0   R  94.0   R  1190.0   R  ...    A   
 
    Unnamed: 2  A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9  10  
 0         0.0    R        1.0    R        0.0    R        0.0    R  10  
 1         0.0    R        6.0    R        1.0    R       17.0    R  10  
 2         NaN    A        NaN    A        NaN    A        NaN    R  17  
 3         0.0    R        6.0    R        4.0    R       67.0    R  29  
 4         NaN    A        NaN    A        NaN    A        NaN    R   6  
 ..        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66        0.0    R        0.0    Z        0.0    A        NaN    R  21  
 67        0.0    R        0.0    R        0.0    A        NaN    R  18  
 68        0.0    R        0.0    R        0.0    A        NaN    R  12  
 69        NaN    A        NaN    A        NaN    A        NaN    R  11  
 70        NaN    A        NaN    A        NaN    A        NaN    R   7  
 
 [71 rows x 33 columns],
     182704  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R       1 R.1  ...  \
 0   182722  A         NaN   A         NaN   A         NaN  R    13.0   R  ...   
 1   182795  R       389.0   R       420.0   R        93.0  R   416.0   R  ...   
 2   182892  A         NaN   A         NaN   A         NaN  R    36.0   R  ...   
 3   182917  R        20.0   R        22.0   R        91.0  R    31.0   R  ...   
 4   182953  A         NaN   A         NaN   A         NaN  R    35.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  185536  R      1647.0   R      3111.0   R        53.0  R  1977.0   R  ...   
 67  185572  R       921.0   R      1135.0   R        81.0  R   994.0   R  ...   
 68  185590  R      3125.0   R      4588.0   R        68.0  R  3082.0   R  ...   
 69  185679  A         NaN   A         NaN   A         NaN  R     3.0   Z  ...   
 70  185721  R       120.0   R       189.0   R        63.0  R   124.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6  10  
 0     R        0.0    R        5.0    R        4.0    R       80.0    R  16  
 1     R        0.0    R        0.0    R        0.0    A        NaN    R  12  
 2     R        0.0    R        8.0    R        7.0    R       88.0    R  12  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R  10  
 4     R        0.0    R       42.0    R       36.0    R       86.0    R  10  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    Z        0.0    R      707.0    R      318.0    R       45.0    R  21  
 67    R        0.0    R        0.0    R        0.0    A        NaN    R  12  
 68    R        0.0    R       19.0    R       12.0    R       63.0    R  18  
 69    Z        0.0    R        1.0    R        1.0    R      100.0    R  10  
 70    Z        0.0    R       35.0    R       24.0    R       69.0    R  15  
 
 [71 rows x 33 columns],
     185767  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R       3 R.1  ...  \
 0   185828  R      1129.0   R      2759.0   R        41.0  R  1299.0   R  ...   
 1   185873  R      1481.0   R      2806.0   R        53.0  R  1630.0   R  ...   
 2   185970  A         NaN   A         NaN   A         NaN  R    18.0   R  ...   
 3   186016  A         NaN   A         NaN   A         NaN  R    29.0   Z  ...   
 4   186034  R       291.0   R       739.0   R        39.0  R   617.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  188669  A         NaN   A         NaN   A         NaN  R    35.0   R  ...   
 67  188678  R        56.0   R        66.0   R        85.0  R   124.0   R  ...   
 68  188687  R        15.0   R       124.0   R        12.0  R    16.0   Z  ...   
 69  188696  A         NaN   A         NaN   A         NaN  R    66.0   Z  ...   
 70  188854  R       416.0   R       485.0   R        86.0  R   163.0   Z  ...   
 
     R.8  0.3  R.9    6.1  R.10      5  R.11    83  R.12  24  
 0     Z  0.0    R   44.0     R   30.0     R  68.0     R  15  
 1     R  0.0    R  389.0     R  158.0     R  41.0     R  18  
 2     R  0.0    R    3.0     R    1.0     R  33.0     R  12  
 3     Z  0.0    R   12.0     R   10.0     R  83.0     R  15  
 4     R  0.0    R  363.0     R  159.0     R  44.0     R  19  
 ..  ...  ...  ...    ...   ...    ...   ...   ...   ...  ..  
 66    A  NaN    A    NaN     A    NaN     A   NaN     R   3  
 67    A  NaN    A    NaN     A    NaN     A   NaN     R   8  
 68    Z  0.0    R   33.0     R   15.0     R  45.0     R  22  
 69    A  NaN    A    NaN     A    NaN     A   NaN     R  18  
 70    A  NaN    A    NaN     A    NaN     A   NaN     R   5  
 
 [71 rows x 33 columns],
     188890  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     659 R.1  ...  \
 0   188915  A         NaN   A         NaN   A         NaN  R     1.0   R  ...   
 1   188942  R        96.0   R       192.0   R        50.0  R    68.0   R  ...   
 2   189088  R       363.0   R       574.0   R        63.0  R   482.0   R  ...   
 3   189097  R       676.0   R       823.0   R        82.0  R   624.0   R  ...   
 4   189219  A         NaN   A         NaN   A         NaN  A     NaN   A  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  190983  R       173.0   R       201.0   R        86.0  R   217.0   R  ...   
 67  191083  R      1479.0   R      2911.0   R        51.0  R  1770.0   Z  ...   
 68  191126  R      1247.0   R      1956.0   R        64.0  R  1383.0   R  ...   
 69  191199  R       898.0   R      1437.0   R        62.0  R   944.0   Z  ...   
 70  191205  R       154.0   R       205.0   R        75.0  R   129.0   R  ...   
 
     Z.2  0.2  R.6  258.1  R.7   175  R.8     68  R.9  18  
 0     A  NaN    A    NaN    A   NaN    A    NaN    R   4  
 1     A  NaN    A    NaN    A   NaN    A    NaN    R  19  
 2     A  NaN    A    NaN    A   NaN    A    NaN    R   9  
 3     R  0.0    R    8.0    R   8.0    R  100.0    R   9  
 4     A  NaN    A    NaN    A   NaN    A    NaN    R  16  
 ..  ...  ...  ...    ...  ...   ...  ...    ...  ...  ..  
 66    A  NaN    A    NaN    A   NaN    A    NaN    R  10  
 67    Z  0.0    R  218.0    R  76.0    R   35.0    R  17  
 68    R  0.0    R   20.0    R  12.0    R   60.0    R  17  
 69    Z  0.0    R   81.0    R  38.0    R   47.0    R  23  
 70    R  0.0    R    2.0    R   1.0    R   50.0    R  10  
 
 [71 rows x 33 columns],
     191241  R   2071 R.1   2463 R.2    84 R.3   2260 R.4  ...  R.11  0.3  \
 0   191287  A    NaN   A    NaN   A   NaN   A    NaN   A  ...     A  NaN   
 1   191302  R  294.0   R  364.0   R  81.0   R  416.0   Z  ...     Z  0.0   
 2   191311  A    NaN   A    NaN   A   NaN   A    NaN   A  ...     A  NaN   
 3   191339  R  532.0   R  967.0   R  55.0   R  655.0   Z  ...     Z  0.0   
 4   191515  R  438.0   R  450.0   R  97.0   R  473.0   R  ...     A  NaN   
 ..     ... ..    ...  ..    ...  ..   ...  ..    ...  ..  ...   ...  ...   
 66  193973  R  623.0   R  748.0   R  83.0   R  626.0   R  ...     Z  0.0   
 67  193991  A    NaN   A    NaN   A   NaN   R    5.0   R  ...     Z  0.0   
 68  194028  R  197.0   R  598.0   R  33.0   R  280.0   Z  ...     Z  0.0   
 69  194091  R  687.0   R  929.0   R  74.0   R  816.0   R  ...     R  0.0   
 70  194116  R   13.0   R  176.0   R   7.0   R    6.0   Z  ...     Z  0.0   
 
     R.12  10.1  R.13     6  R.14     60  R.15  13  
 0      A   NaN     A   NaN     A    NaN     P  10  
 1      R  63.0     R  31.0     R   49.0     R  22  
 2      A   NaN     A   NaN     A    NaN     R   5  
 3      R  65.0     R  32.0     R   49.0     R  15  
 4      A   NaN     A   NaN     A    NaN     R   9  
 ..   ...   ...   ...   ...   ...    ...   ...  ..  
 66     R   0.0     R   0.0     A    NaN     R  11  
 67     R   1.0     R   1.0     R  100.0     R  11  
 68     R  23.0     R  11.0     R   48.0     R  14  
 69     R  33.0     R   5.0     R   15.0     R  11  
 70     R   0.0     Z   0.0     A    NaN     R  11  
 
 [71 rows x 33 columns],
     194161  R     116 R.1     498 R.2     23 R.3      83 R.4  ...  R.11  0.3  \
 0   194189  R     1.0   R     1.0   R  100.0   R     0.0   R  ...     A  NaN   
 1   194222  R  1347.0   R  2161.0   R   62.0   R  1647.0   Z  ...     Z  0.0   
 2   194240  R   920.0   R  1463.0   R   63.0   R   913.0   Z  ...     Z  0.0   
 3   194259  A     NaN   A     NaN   A    NaN   R    19.0   R  ...     A  NaN   
 4   194310  R  1773.0   R  2290.0   R   77.0   R  1915.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..    ...  ..     ...  ..  ...   ...  ...   
 66  196264  R   102.0   R  2219.0   R    5.0   R    57.0   R  ...     R  0.0   
 67  196291  R   341.0   R   436.0   R   78.0   R   344.0   Z  ...     Z  0.0   
 68  196307  A     NaN   A     NaN   A    NaN   A     NaN   A  ...     A  NaN   
 69  196389  R    78.0   R   181.0   R   43.0   R   325.0   R  ...     R  0.0   
 70  196413  R  3486.0   R  3877.0   R   90.0   R  3641.0   R  ...     Z  0.0   
 
     R.12    6.1  R.13      1  R.14    17  R.15  16  
 0      A    NaN     A    NaN     A   NaN     R  15  
 1      R  169.0     R   40.0     R  24.0     R  19  
 2      R  211.0     R  110.0     R  52.0     R  18  
 3      A    NaN     A    NaN     A   NaN     R  13  
 4      R   31.0     R   18.0     R  58.0     R  14  
 ..   ...    ...   ...    ...   ...   ...   ...  ..  
 66     R   19.0     R    9.0     R  47.0     R  18  
 67     R    1.0     R    0.0     R   0.0     R  16  
 68     A    NaN     A    NaN     A   NaN     R   6  
 69     R   14.0     R    2.0     R  14.0     R  12  
 70     R   19.0     R    7.0     R  37.0     R  15  
 
 [71 rows x 33 columns],
     196431  R     42 R.1     227 R.2     19 R.3     50 R.4  ...  A.2  \
 0   196440  R    2.0   R     2.0   R  100.0   R    3.0   R  ...    A   
 1   196565  R  337.0   R   562.0   R   60.0   R  537.0   R  ...    Z   
 2   196583  R    3.0   R    25.0   R   12.0   R    7.0   R  ...    A   
 3   196592  R  526.0   R  1771.0   R   30.0   R  340.0   Z  ...    Z   
 4   196653  R   88.0   R   416.0   R   21.0   R    1.0   R  ...    R   
 ..     ... ..    ...  ..     ...  ..    ...  ..    ...  ..  ...  ...   
 66  198561  R  470.0   R   728.0   R   65.0   R  401.0   R  ...    Z   
 67  198570  R  242.0   R  2087.0   R   12.0   R  296.0   R  ...    R   
 68  198598  R  133.0   R   227.0   R   59.0   R  181.0   R  ...    A   
 69  198613  R  372.0   R   461.0   R   81.0   R  380.0   R  ...    R   
 70  198622  R  862.0   R  3781.0   R   23.0   R  932.0   R  ...    R   
 
    Unnamed: 2  A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9  14  
 0         NaN    A        NaN    A        NaN    A        NaN    R   8  
 1         0.0    R       39.0    R       16.0    R       41.0    R  11  
 2         NaN    A        NaN    A        NaN    A        NaN    R   9  
 3         0.0    R       36.0    R       21.0    R       58.0    R  11  
 4         0.0    R        0.0    R        0.0    A        NaN    R  11  
 ..        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66        0.0    R        0.0    Z        0.0    A        NaN    R  13  
 67        0.0    R     1427.0    R      701.0    R       49.0    R  11  
 68        NaN    A        NaN    A        NaN    A        NaN    R   8  
 69        0.0    R        0.0    R        0.0    A        NaN    R  12  
 70        0.0    R      898.0    R      369.0    R       41.0    R  18  
 
 [71 rows x 33 columns],
     198640  R      29 R.1     329 R.2     9 R.3     128 R.4  ...  R.11  0.3  \
 0   198668  R   170.0   R   570.0   R  30.0   R   203.0   R  ...     R  0.0   
 1   198677  R     1.0   R     4.0   R  25.0   R     0.0   Z  ...     Z  0.0   
 2   198695  R  1387.0   R  1470.0   R  94.0   R  1400.0   R  ...     A  NaN   
 3   198710  R   199.0   R   739.0   R  27.0   R   200.0   R  ...     R  0.0   
 4   198729  R    72.0   R   373.0   R  19.0   R    75.0   Z  ...     Z  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  199847  R  1452.0   R  1472.0   R  99.0   R  1359.0   R  ...     R  0.0   
 67  199856  R  1700.0   R  7170.0   R  24.0   R  2085.0   R  ...     R  0.0   
 68  199865  R   194.0   R   226.0   R  86.0   R   235.0   R  ...     A  NaN   
 69  199883  A     NaN   A     NaN   A   NaN   A     NaN   A  ...     A  NaN   
 70  199892  R   152.0   R   374.0   R  41.0   R   281.0   R  ...     R  0.0   
 
     R.12    33.1  R.13     10  R.14     30  R.15  11  
 0      R    57.0     R   28.0     R   49.0     R   8  
 1      R     0.0     Z    0.0     A    NaN     R   5  
 2      A     NaN     A    NaN     A    NaN     R  17  
 3      R   113.0     R   68.0     R   60.0     R  12  
 4      R    53.0     R   22.0     R   42.0     R   9  
 ..   ...     ...   ...    ...   ...    ...   ...  ..  
 66     R     1.0     R    1.0     R  100.0     R  11  
 67     R  1819.0     R  926.0     R   51.0     R  18  
 68     A     NaN     A    NaN     A    NaN     R   9  
 69     A     NaN     A    NaN     A    NaN     R  11  
 70     R   163.0     R   84.0     R   52.0     R  12  
 
 [71 rows x 33 columns],
     199908  R      74 R.1     561 R.2    13 R.3     122 R.4  ...  R.11  0.3  \
 0   199926  R   294.0   R   812.0   R  36.0   R   342.0   R  ...     R  0.0   
 1   199953  R    86.0   R   438.0   R  20.0   R    93.0   R  ...     R  0.0   
 2   199962  R   815.0   R   897.0   R  91.0   R   942.0   R  ...     R  0.0   
 3   199971  R    34.0   R    38.0   R  89.0   R    10.0   R  ...     A  NaN   
 4   199980  A     NaN   A     NaN   A   NaN   R    32.0   R  ...     A  NaN   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  202046  R   149.0   R   176.0   R  85.0   R   155.0   R  ...     R  0.0   
 67  202073  R    54.0   R    60.0   R  90.0   R    54.0   R  ...     A  NaN   
 68  202134  R  1752.0   R  2972.0   R  59.0   R  1943.0   R  ...     R  0.0   
 69  202152  A     NaN   A     NaN   A   NaN   R    63.0   R  ...     A  NaN   
 70  202170  R   222.0   R   286.0   R  78.0   R   267.0   R  ...     R  0.0   
 
     R.12   69.1  R.13    39  R.14    57  R.15  10  
 0      R  122.0     R  63.0     R  52.0     R   8  
 1      R   51.0     R  15.0     R  29.0     R  11  
 2      R    1.0     R   0.0     R   0.0     R  15  
 3      A    NaN     A   NaN     A   NaN     R  13  
 4      A    NaN     A   NaN     A   NaN     R  20  
 ..   ...    ...   ...   ...   ...   ...   ...  ..  
 66     R    0.0     R   0.0     A   NaN     R   9  
 67     A    NaN     A   NaN     A   NaN     R   6  
 68     R   24.0     R   8.0     R  33.0     R  15  
 69     A    NaN     A   NaN     A   NaN     R  17  
 70     R    1.0     R   0.0     R   0.0     R  10  
 
 [71 rows x 33 columns],
     202222  R    1975 R.1   11693 R.2    17 R.3    2006 R.4  ...  R.11  0.3  \
 0   202356  R  1098.0   R  3483.0   R  32.0   R  1497.0   R  ...     R  0.0   
 1   202435  R    13.0   R    34.0   R  38.0   R     1.0   R  ...     R  0.0   
 2   202453  A     NaN   A     NaN   A   NaN   R    20.0   R  ...     R  0.0   
 3   202480  R  2121.0   R  2300.0   R  92.0   R  2029.0   R  ...     R  0.0   
 4   202514  R   217.0   R   262.0   R  83.0   R   180.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  204486  R   901.0   R  1033.0   R  87.0   R   118.0   R  ...     R  0.0   
 67  204501  R   676.0   R   697.0   R  97.0   R   797.0   R  ...     A  NaN   
 68  204608  A     NaN   A     NaN   A   NaN   R   550.0   Z  ...     A  NaN   
 69  204617  R   202.0   R   285.0   R  71.0   R   285.0   R  ...     R  0.0   
 70  204635  R   579.0   R   664.0   R  87.0   R   608.0   R  ...     A  NaN   
 
     R.12  2118.1  R.13    974  R.14     46  R.15  20  
 0      R  1232.0     R  363.0     R   29.0     R  16  
 1      R    14.0     R    6.0     R   43.0     R   8  
 2      R     6.0     R    4.0     R   67.0     R  25  
 3      R     6.0     R    5.0     R   83.0     R  15  
 4      R     1.0     R    0.0     R    0.0     R  12  
 ..   ...     ...   ...    ...   ...    ...   ...  ..  
 66     R     0.0     R    0.0     A    NaN     R  23  
 67     A     NaN     A    NaN     A    NaN     R   9  
 68     A     NaN     A    NaN     A    NaN     R  12  
 69     R     2.0     R    2.0     R  100.0     R  15  
 70     A     NaN     A    NaN     A    NaN     R  11  
 
 [71 rows x 33 columns],
     204662  R     233 R.1     280 R.2    83 R.3     244 R.4  ...  R.11  0.3  \
 0   204671  R   321.0   R   504.0   R  64.0   R   356.0   R  ...     R  0.0   
 1   204680  R   343.0   R   493.0   R  70.0   R   421.0   R  ...     R  0.0   
 2   204699  R   367.0   R   530.0   R  69.0   R   464.0   R  ...     R  0.0   
 3   204705  R  1331.0   R  1708.0   R  78.0   R  1494.0   R  ...     R  0.0   
 4   204714  A     NaN   A     NaN   A   NaN   R    27.0   R  ...     A  NaN   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  206996  R   413.0   R   791.0   R  52.0   R   479.0   R  ...     R  0.0   
 67  207041  R   444.0   R   781.0   R  57.0   R   480.0   R  ...     R  0.0   
 68  207050  R   224.0   R   537.0   R  42.0   R   270.0   R  ...     R  0.0   
 69  207069  R   231.0   R  1191.0   R  19.0   R   222.0   R  ...     R  0.0   
 70  207087  A     NaN   A     NaN   A   NaN   R     2.0   R  ...     A  NaN   
 
     R.12    3.1  R.13     2  R.14    67  R.15  15  
 0      R    6.0     R   3.0     R  50.0     R  20  
 1      R    8.0     R   4.0     R  50.0     R  18  
 2      R    8.0     R   2.0     R  25.0     R  18  
 3      R   63.0     R  20.0     R  32.0     R  29  
 4      A    NaN     A   NaN     A   NaN     R  14  
 ..   ...    ...   ...   ...   ...   ...   ...  ..  
 66     R  140.0     R  54.0     R  39.0     R  23  
 67     R   14.0     R   3.0     R  21.0     R  18  
 68     R   65.0     R  27.0     R  42.0     R  18  
 69     R   87.0     R  34.0     R  39.0     R  19  
 70     A    NaN     A   NaN     A   NaN     R  12  
 
 [71 rows x 33 columns],
     207102  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     27 R.1  ...  \
 0   207157  R        32.0   R        67.0   R        48.0  R   54.0   Z  ...   
 1   207209  R       645.0   R       719.0   R        90.0  R  346.0   R  ...   
 2   207236  R       361.0   R      1396.0   R        26.0  R  359.0   R  ...   
 3   207254  A         NaN   A         NaN   A         NaN  R    0.0   R  ...   
 4   207263  R       661.0   R      1809.0   R        37.0  R  717.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  209667  A         NaN   A         NaN   A         NaN  R   44.0   Z  ...   
 67  209676  A         NaN   A         NaN   A         NaN  R   86.0   R  ...   
 68  209694  A         NaN   A         NaN   A         NaN  R   44.0   R  ...   
 69  209700  A         NaN   A         NaN   A         NaN  R   38.0   R  ...   
 70  209719  A         NaN   A         NaN   A         NaN  R   49.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6  16  
 0     Z        0.0    R        1.0    R        1.0    R      100.0    R  22  
 1     R        0.0    R        5.0    R        0.0    R        0.0    R  14  
 2     R        0.0    R       78.0    R       28.0    R       36.0    R  17  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R   9  
 4     R        0.0    R       16.0    R        7.0    R       44.0    R  17  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 67    A        NaN    A        NaN    A        NaN    A        NaN    R  12  
 68    A        NaN    A        NaN    A        NaN    A        NaN    R   7  
 69    A        NaN    A        NaN    A        NaN    A        NaN    R  17  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  19  
 
 [71 rows x 33 columns],
     209746  R    1928 R.1    5899 R.2     33 R.3    2175 R.4  ...  R.11  0.2  \
 0   209807  R  1432.0   R  5893.0   R   24.0   R  1576.0   R  ...     R  0.0   
 1   209825  R   803.0   R   893.0   R   90.0   R  1003.0   R  ...     R  0.0   
 2   209922  R   363.0   R   399.0   R   91.0   R   394.0   R  ...     Z  0.0   
 3   209940  R   334.0   R   872.0   R   38.0   R   519.0   Z  ...     Z  0.0   
 4   210146  R   529.0   R  1545.0   R   34.0   R   622.0   Z  ...     Z  0.0   
 ..     ... ..     ...  ..     ...  ..    ...  ..     ...  ..  ...   ...  ...   
 66  212337  A     NaN   A     NaN   A    NaN   R    23.0   R  ...     Z  0.0   
 67  212355  A     NaN   A     NaN   A    NaN   R    67.0   R  ...     Z  0.0   
 68  212382  A     NaN   A     NaN   A    NaN   R    37.0   R  ...     Z  0.0   
 69  212391  A     NaN   A     NaN   A    NaN   R    35.0   R  ...     Z  0.0   
 70  212434  R    99.0   R    99.0   R  100.0   R    94.0   R  ...     A  NaN   
 
     R.12 1167.1  R.13    476  R.14    41  R.15  20  
 0      R   98.0     R   39.0     R  40.0     R  19  
 1      R    0.0     R    0.0     A   NaN     R  11  
 2      R    0.0     Z    0.0     A   NaN     R   9  
 3      R  538.0     R  183.0     R  34.0     R  17  
 4      R   16.0     R    7.0     R  44.0     R  20  
 ..   ...    ...   ...    ...   ...   ...   ...  ..  
 66     R   17.0     R    9.0     R  53.0     R  25  
 67     R    8.0     R    4.0     R  50.0     R  21  
 68     R   14.0     R   10.0     R  71.0     R  25  
 69     R    7.0     R    4.0     R  57.0     R  15  
 70     A    NaN     A    NaN     A   NaN     R   7  
 
 [71 rows x 33 columns],
     212577  R    553 R.1     575 R.2    96 R.3    627 R.4  ...  A.2  \
 0   212601  R  629.0   R  1004.0   R  63.0   R  691.0   R  ...    R   
 1   212656  R  275.0   R   378.0   R  73.0   R  310.0   R  ...    R   
 2   212674  R  640.0   R   654.0   R  98.0   R  681.0   R  ...    A   
 3   212753  A    NaN   A     NaN   A   NaN   R   35.0   R  ...    Z   
 4   212771  A    NaN   A     NaN   A   NaN   A    NaN   A  ...    A   
 ..     ... ..    ...  ..     ...  ..   ...  ..    ...  ..  ...  ...   
 66  214944  A    NaN   A     NaN   A   NaN   R  130.0   R  ...    R   
 67  214971  R   19.0   R    34.0   R  56.0   R   31.0   R  ...    R   
 68  215008  A    NaN   A     NaN   A   NaN   R   17.0   R  ...    A   
 69  215044  A    NaN   A     NaN   A   NaN   A    NaN   A  ...    A   
 70  215053  R   72.0   R    84.0   R  86.0   R   64.0   R  ...    A   
 
    Unnamed: 2  A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9   9  
 0         0.0    R        5.0    R        2.0    R       40.0    R  13  
 1         0.0    R        4.0    R        2.0    R       50.0    R  12  
 2         NaN    A        NaN    A        NaN    A        NaN    R  10  
 3         0.0    R        6.0    R        2.0    R       33.0    R  32  
 4         NaN    A        NaN    A        NaN    A        NaN    R  12  
 ..        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66        0.0    R       46.0    R       26.0    R       57.0    R  11  
 67        0.0    R        1.0    R        0.0    R        0.0    R   5  
 68        NaN    A        NaN    A        NaN    A        NaN    R  15  
 69        NaN    A        NaN    A        NaN    A        NaN    R  15  
 70        NaN    A        NaN    A        NaN    A        NaN    R   8  
 
 [71 rows x 33 columns],
     215062  R    2285 R.1    2716 R.2     84 R.3    2341 R.4  ...  R.11  0.2  \
 0   215105  R   297.0   R   359.0   R   83.0   R   401.0   R  ...     Z  0.0   
 1   215114  R   207.0   R   273.0   R   76.0   R   177.0   R  ...     R  0.0   
 2   215132  R   316.0   R   404.0   R   78.0   R   426.0   R  ...     R  0.0   
 3   215239  R  1153.0   R  3227.0   R   36.0   R  1524.0   R  ...     R  0.0   
 4   215266  R   366.0   R   442.0   R   83.0   R   318.0   R  ...     A  NaN   
 ..     ... ..     ...  ..     ...  ..    ...  ..     ...  ..  ...   ...  ...   
 66  217040  R     9.0   R     9.0   R  100.0   R    15.0   R  ...     A  NaN   
 67  217059  R   846.0   R  1024.0   R   83.0   R   942.0   R  ...     R  0.0   
 68  217077  A     NaN   A     NaN   A    NaN   R   124.0   Z  ...     A  NaN   
 69  217156  R  1751.0   R  1905.0   R   92.0   R  1660.0   R  ...     A  NaN   
 70  217165  R   804.0   R   884.0   R   91.0   R   854.0   R  ...     R  0.0   
 
     R.12    48.1  R.13     45  R.14    94  R.15   6  
 0      R     0.0     R    0.0     A   NaN     R   9  
 1      R     0.0     R    0.0     A   NaN     R  11  
 2      R     0.0     Z    0.0     A   NaN     R  12  
 3      R  1879.0     R  708.0     R  38.0     R  18  
 4      A     NaN     A    NaN     A   NaN     R  17  
 ..   ...     ...   ...    ...   ...   ...   ...  ..  
 66     A     NaN     A    NaN     A   NaN     R   8  
 67     R     4.0     R    0.0     R   0.0     R  14  
 68     A     NaN     A    NaN     A   NaN     R  13  
 69     A     NaN     A    NaN     A   NaN     R   6  
 70     R     0.0     R    0.0     A   NaN     R  13  
 
 [71 rows x 33 columns],
     217235  R    1201 R.1    1397 R.2    86 R.3    1400 R.4  ...  R.11  0.3  \
 0   217305  R   381.0   R   461.0   R  83.0   R     0.0   R  ...     A  NaN   
 1   217323  A     NaN   A     NaN   A   NaN   R   408.0   R  ...     A  NaN   
 2   217402  R   996.0   R  1112.0   R  90.0   R  1090.0   R  ...     R  0.0   
 3   217420  R   780.0   R  1372.0   R  57.0   R   919.0   R  ...     R  0.0   
 4   217475  R  2509.0   R  4716.0   R  53.0   R  2860.0   R  ...     R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  218894  R  1169.0   R  3718.0   R  31.0   R  1209.0   R  ...     Z  0.0   
 67  218919  R    56.0   R    78.0   R  72.0   R   163.0   R  ...     A  NaN   
 68  218955  R    33.0   R   197.0   R  17.0   R    54.0   R  ...     Z  0.0   
 69  218964  R   959.0   R  1372.0   R  70.0   R  1073.0   R  ...     R  0.0   
 70  218973  R   496.0   R   518.0   R  96.0   R   474.0   R  ...     A  NaN   
 
     R.12   13.1  R.13      3  R.14    23  R.15    16  
 0      A    NaN     A    NaN     A   NaN     R  10.0  
 1      A    NaN     A    NaN     A   NaN     R   8.0  
 2      R    4.0     R    2.0     R  50.0     R  11.0  
 3      R   34.0     R   15.0     R  44.0     R  14.0  
 4      R  879.0     R  327.0     R  37.0     R  20.0  
 ..   ...    ...   ...    ...   ...   ...   ...   ...  
 66     R  709.0     R  164.0     R  23.0     R  20.0  
 67     A    NaN     A    NaN     A   NaN     R  11.0  
 68     R   49.0     R   21.0     R  43.0     R  12.0  
 69     R    0.0     R    0.0     A   NaN     R  12.0  
 70     A    NaN     A    NaN     A   NaN     R  12.0  
 
 [71 rows x 33 columns],
     218991  R    663 R.1    1633 R.2    41 R.3    804 R.4  ...  Z  0.3  R.11  \
 0   219000  R  318.0   R   437.0   R  73.0   R  430.0   R  ...  R  0.0     R   
 1   219037  A    NaN   A     NaN   A   NaN   R   19.0   R  ...  R  0.0     R   
 2   219046  R  413.0   R  1333.0   R  31.0   R  436.0   R  ...  R  0.0     R   
 3   219082  R  360.0   R  1050.0   R  34.0   R  399.0   R  ...  R  0.0     R   
 4   219091  R  176.0   R   231.0   R  76.0   R  163.0   R  ...  A  NaN     A   
 ..     ... ..    ...  ..     ...  ..   ...  ..    ...  ..  ... ..  ...   ...   
 66  220604  R   51.0   R   115.0   R  44.0   R  140.0   R  ...  R  0.0     R   
 67  220613  R  681.0   R  1525.0   R  45.0   R  787.0   R  ...  R  0.0     R   
 68  220631  R  226.0   R   577.0   R  39.0   R  231.0   R  ...  R  0.0     R   
 69  220640  A    NaN   A     NaN   A   NaN   R   54.0   R  ...  R  0.0     R   
 70  220701  R  185.0   R   295.0   R  63.0   R  233.0   R  ...  Z  0.0     R   
 
    193.1  R.12    55  R.13    28  R.14  19  
 0    0.0     R   0.0     A   NaN     R  12  
 1    0.0     R   0.0     A   NaN     R   7  
 2   32.0     R  15.0     R  47.0     R  16  
 3   21.0     R   7.0     R  33.0     R  18  
 4    NaN     A   NaN     A   NaN     R  12  
 ..   ...   ...   ...   ...   ...   ...  ..  
 66   5.0     R   1.0     R  20.0     R  11  
 67   5.0     R   0.0     R   0.0     R  15  
 68   3.0     R   1.0     R  33.0     R  13  
 69  13.0     R   9.0     R  69.0     R  25  
 70   0.0     Z   0.0     A   NaN     R  13  
 
 [71 rows x 33 columns],
     220710  R     271 R.1     390 R.2    69 R.3     272 R.4  ...  A.2  \
 0   220756  A     NaN   A     NaN   A   NaN   R    41.0   R  ...    R   
 1   220765  A     NaN   A     NaN   A   NaN   R    42.0   R  ...    R   
 2   220853  A     NaN   A     NaN   A   NaN   R   155.0   R  ...    R   
 3   220862  R  2448.0   R  5476.0   R  45.0   R  2617.0   R  ...    R   
 4   220978  R  3035.0   R  5023.0   R  60.0   R  3282.0   Z  ...    Z   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...  ...   
 66  223922  R   343.0   R  1062.0   R  32.0   R   204.0   R  ...    R   
 67  224004  R   217.0   R   397.0   R  55.0   R   233.0   R  ...    R   
 68  224013  A     NaN   A     NaN   A   NaN   R     8.0   R  ...    A   
 69  224110  R   425.0   R  2808.0   R  15.0   R   687.0   R  ...    R   
 70  224147  R  1448.0   R  2269.0   R  64.0   R  1744.0   R  ...    R   
 
    Unnamed: 2  A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9  12  
 0         0.0    R        2.0    R        2.0    R      100.0    R  14  
 1         0.0    R       14.0    R        7.0    R       50.0    R  20  
 2         0.0    R       18.0    R       10.0    R       56.0    R  20  
 3         0.0    R       66.0    R       37.0    R       56.0    R  16  
 4         0.0    R       30.0    R       20.0    R       67.0    R  17  
 ..        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66        0.0    R      470.0    R      232.0    R       49.0    R  19  
 67        0.0    R        2.0    R        2.0    R      100.0    R  12  
 68        NaN    A        NaN    A        NaN    A        NaN    R  25  
 69        0.0    R      721.0    R      320.0    R       44.0    R  18  
 70        0.0    R      171.0    R       82.0    R       48.0    R  20  
 
 [71 rows x 33 columns],
     224156  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      43  Z  ...  \
 0   224226  R       607.0   R       866.0   R        70.0  R   571.0  R  ...   
 1   224244  R        47.0   R       139.0   R        34.0  R    34.0  R  ...   
 2   224271  R        88.0   R       262.0   R        34.0  R    74.0  R  ...   
 3   224323  R       379.0   R       428.0   R        89.0  R   382.0  R  ...   
 4   224350  R       234.0   R      1088.0   R        22.0  R   505.0  R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ... ..  ...   
 66  227368  R      4851.0   R      6995.0   R        69.0  R  4477.0  R  ...   
 67  227377  R       683.0   R      3428.0   R        20.0  R   801.0  Z  ...   
 68  227386  R       286.0   R       920.0   R        31.0  R   293.0  R  ...   
 69  227401  R       557.0   R      1832.0   R        30.0  R   637.0  R  ...   
 70  227429  R        60.0   R        93.0   R        65.0  R   146.0  Z  ...   
 
     Z.4  0.4  R.4    0.5  R.5    0.6  A.3 Unnamed: 3  R.6  18  
 0     R  1.0    R    1.0    R    0.0    R        0.0    R  14  
 1     A  NaN    A    NaN    A    NaN    A        NaN    R   9  
 2     A  NaN    A    NaN    A    NaN    A        NaN    R  24  
 3     A  NaN    A    NaN    A    NaN    A        NaN    R  12  
 4     R  0.0    R  651.0    R  283.0    R       43.0    R  14  
 ..  ...  ...  ...    ...  ...    ...  ...        ...  ...  ..  
 66    R  0.0    R  316.0    R  199.0    R       63.0    R  21  
 67    R  0.0    R  631.0    R  331.0    R       52.0    R  20  
 68    R  0.0    R   51.0    R   23.0    R       45.0    R  14  
 69    R  0.0    R  178.0    R   68.0    R       38.0    R  25  
 70    Z  0.0    R    1.0    R    1.0    R      100.0    R  18  
 
 [71 rows x 33 columns],
     227526  R   1807 R.1    2344 R.2    77 R.3   1687 R.4  ...  R.11  0.3  \
 0   227687  R  296.0   R   908.0   R  33.0   R  260.0   Z  ...     Z  0.0   
 1   227748  A    NaN   A     NaN   A   NaN   R   13.0   Z  ...     A  NaN   
 2   227757  R  993.0   R  1071.0   R  93.0   R  961.0   R  ...     A  NaN   
 3   227845  R  594.0   R   785.0   R  76.0   R  695.0   R  ...     R  0.0   
 4   227854  R  362.0   R  1907.0   R  19.0   R  539.0   Z  ...     Z  0.0   
 ..     ... ..    ...  ..     ...  ..   ...  ..    ...  ..  ...   ...  ...   
 66  230047  R  585.0   R   833.0   R  70.0   R  523.0   R  ...     Z  0.0   
 67  230056  R   40.0   R   166.0   R  24.0   R    0.0   R  ...     R  0.0   
 68  230065  A    NaN   A     NaN   A   NaN   R   30.0   R  ...     R  0.0   
 69  230144  A    NaN   A     NaN   A   NaN   R   21.0   R  ...     R  0.0   
 70  230162  A    NaN   A     NaN   A   NaN   R  118.0   R  ...     R  0.0   
 
     R.12   40.1  R.13      5  R.14     13  R.15  17  
 0      R  401.0     R  223.0     R   56.0     R  19  
 1      A    NaN     A    NaN     A    NaN     R   9  
 2      A    NaN     A    NaN     A    NaN     R   6  
 3      R    2.0     R    2.0     R  100.0     R  14  
 4      R  765.0     R  259.0     R   34.0     R  18  
 ..   ...    ...   ...    ...   ...    ...   ...  ..  
 66     R    1.0     Z    0.0     R    0.0     R  16  
 67     R    0.0     R    0.0     A    NaN     R  14  
 68     R   15.0     R   11.0     R   73.0     R  12  
 69     R    0.0     R    0.0     A    NaN     R  10  
 70     R  322.0     R  179.0     R   56.0     R  10  
 
 [71 rows x 33 columns],
     230171  R   2280 R.1   4126 R.2    55 R.3   1318 R.4  ...  R.11  0.1  \
 0   230199  A    NaN   A    NaN   A   NaN   R   35.0   R  ...     A  NaN   
 1   230205  A    NaN   A    NaN   A   NaN   R   17.0   R  ...     R  0.0   
 2   230214  A    NaN   A    NaN   A   NaN   R    7.0   Z  ...     Z  0.0   
 3   230366  R   35.0   R   67.0   R  52.0   R   19.0   R  ...     Z  0.0   
 4   230418  R  307.0   R  652.0   R  47.0   R  388.0   R  ...     R  0.0   
 ..     ... ..    ...  ..    ...  ..   ...  ..    ...  ..  ...   ...  ...   
 66  232706  R  360.0   R  622.0   R  58.0   R  391.0   R  ...     R  0.0   
 67  232724  A    NaN   A    NaN   A   NaN   A    NaN   A  ...     A  NaN   
 68  232788  R  279.0   R  485.0   R  58.0   R  304.0   R  ...     R  0.0   
 69  232797  R   38.0   R  207.0   R  18.0   R    2.0   R  ...     R  0.0   
 70  232867  R  438.0   R  773.0   R  57.0   R  515.0   R  ...     R  0.0   
 
     R.12    119  R.13    42  R.14    35  R.15  25  
 0      A    NaN     A   NaN     A   NaN     R  16  
 1      R    7.0     R   6.0     R  86.0     R  20  
 2      R   25.0     R  22.0     R  88.0     R  15  
 3      R    0.0     Z   0.0     A   NaN     R  19  
 4      R   73.0     R  34.0     R  47.0     R  16  
 ..   ...    ...   ...   ...   ...   ...   ...  ..  
 66     R    1.0     R   0.0     R   0.0     R  14  
 67     A    NaN     A   NaN     A   NaN     R   3  
 68     R   73.0     R  21.0     R  29.0     R  19  
 69     R    4.0     R   2.0     R  50.0     R  39  
 70     R  132.0     R  53.0     R  40.0     R  30  
 
 [71 rows x 33 columns],
     232885  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2 A.3  Unnamed: 3 A.4  \
 0   232919  A         NaN   A         NaN   A         NaN   R        89.0   R   
 1   232937  R      1043.0   R      1418.0   R        74.0   R      1197.0   R   
 2   232946  R      4974.0   R     10148.0   R        49.0   R      5127.0   R   
 3   232982  R      3078.0   R      5151.0   R        60.0   R      3107.0   R   
 4   233019  R       333.0   R       514.0   R        65.0   R       389.0   R   
 ..     ... ..         ...  ..         ...  ..         ...  ..         ...  ..   
 66  235352  A         NaN   A         NaN   A         NaN   R         7.0   R   
 67  235422  R        64.0   R       301.0   R        21.0   R        90.0   R   
 68  235431  R       419.0   R      1194.0   R        35.0   R         0.0   Z   
 69  235501  A         NaN   A         NaN   A         NaN   R        58.0   R   
 70  235547  A         NaN   A         NaN   A         NaN   A         NaN   A   
 
     ...  A.11 Unnamed: 11  A.12 Unnamed: 12  A.13 Unnamed: 13  A.14  \
 0   ...     A         NaN     A         NaN     A         NaN     A   
 1   ...     R         0.0     R         0.0     R         0.0     A   
 2   ...     R         0.0     R      3277.0     R      1752.0     R   
 3   ...     R         0.0     R        38.0     R        26.0     R   
 4   ...     R         0.0     R        68.0     R        28.0     R   
 ..  ...   ...         ...   ...         ...   ...         ...   ...   
 66  ...     A         NaN     A         NaN     A         NaN     A   
 67  ...     R         0.0     R        33.0     R        14.0     R   
 68  ...     Z         0.0     R         0.0     R         0.0     A   
 69  ...     A         NaN     A         NaN     A         NaN     A   
 70  ...     A         NaN     A         NaN     A         NaN     A   
 
    Unnamed: 14  R   9  
 0          NaN  R  12  
 1          NaN  R  16  
 2         53.0  R  28  
 3         68.0  R  18  
 4         41.0  R  18  
 ..         ... ..  ..  
 66         NaN  R  20  
 67        42.0  R   6  
 68         NaN  R  15  
 69         NaN  R   6  
 70         NaN  R  11  
 
 [71 rows x 33 columns],
     235583  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     46  Z  ...  \
 0   235671  R       184.0   R      1025.0   R        18.0  R  163.0  Z  ...   
 1   235699  R       163.0   R      1021.0   R        16.0  R    0.0  Z  ...   
 2   235750  R       259.0   R       702.0   R        37.0  R    0.0  Z  ...   
 3   236018  A         NaN   A         NaN   A         NaN  R    0.0  R  ...   
 4   236072  R       300.0   R      1668.0   R        18.0  R    0.0  Z  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ... ..  ...   
 66  237899  R       273.0   R      1550.0   R        18.0  R  269.0  R  ...   
 67  237905  A         NaN   A         NaN   A         NaN  A    NaN  A  ...   
 68  237932  R       412.0   R       693.0   R        59.0  R  464.0  R  ...   
 69  237950  R       279.0   R       685.0   R        41.0  R  318.0  R  ...   
 70  237969  R       258.0   R       276.0   R        93.0  R  326.0  R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.4  15  
 0     Z        0.0    R       16.0    R       10.0    R       63.0    R  16  
 1     Z        0.0    R        0.0    R        0.0    A        NaN    R  12  
 2     Z        0.0    R        0.0    R        0.0    A        NaN    R  14  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R   7  
 4     Z        0.0    R        0.0    R        0.0    A        NaN    R  17  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    R        0.0    R        4.0    R        1.0    R       25.0    R  12  
 67    A        NaN    A        NaN    A        NaN    A        NaN    R   6  
 68    R        0.0    R        1.0    R        0.0    R        0.0    R  11  
 69    R        0.0    R        3.0    R        1.0    R       33.0    R  13  
 70    R        0.0    R        0.0    R        0.0    A        NaN    R  12  
 
 [71 rows x 33 columns],
     237987  R      47 R.1     103 R.2    46 R.3    49 R.4  ...  A.2  \
 0   237996  R   101.0   R   278.0   R  36.0   R    69   R  ...    A   
 1   238005  A     NaN   A     NaN   A   NaN   R    44   R  ...    A   
 2   238014  R   233.0   R   556.0   R  42.0   R   248   R  ...    R   
 3   238032  R  4479.0   R  5882.0   R  76.0   R  4940   R  ...    R   
 4   238078  R   188.0   R   290.0   R  65.0   R    75   R  ...    R   
 ..     ... ..     ...  ..     ...  ..   ...  ..   ...  ..  ...  ...   
 66  240693  R   355.0   R  1123.0   R  32.0   R   387   R  ...    R   
 67  240709  A     NaN   A     NaN   A   NaN   R    26   R  ...    R   
 68  240718  A     NaN   A     NaN   A   NaN   R   251   R  ...    A   
 69  240727  R  1401.0   R  2363.0   R  59.0   R  1752   R  ...    R   
 70  240736  R   281.0   R   374.0   R  75.0   R     0   R  ...    R   
 
    Unnamed: 2  A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9  22  
 0         NaN    A        NaN    A        NaN    A        NaN    R  26  
 1         NaN    A        NaN    A        NaN    A        NaN    R  20  
 2         0.0    R       36.0    R       12.0    R       33.0    R  11  
 3         0.0    R        9.0    R        1.0    R       11.0    R  21  
 4         0.0    R       46.0    R       29.0    R       63.0    R  12  
 ..        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66        0.0    R       52.0    R       13.0    R       25.0    R  11  
 67        0.0    R       18.0    R       11.0    R       61.0    R  10  
 68        NaN    A        NaN    A        NaN    A        NaN    R  15  
 69        0.0    R        8.0    R        0.0    R        0.0    R  14  
 70        0.0    R        0.0    R        0.0    A        NaN    R  14  
 
 [71 rows x 33 columns],
     240745  R     339 R.1      779 R.2    44 R.3       0 R.4  ...  R.10  0.7  \
 0   240754  R   409.0   R    552.0   R  74.0   R   384.0   R  ...     R  0.0   
 1   240790  R   252.0   R    302.0   R  83.0   R    60.0   R  ...     R  0.0   
 2   240879  A     NaN   A      NaN   A   NaN   R   155.0   R  ...     A  NaN   
 3   240985  A     NaN   A      NaN   A   NaN   R    22.0   R  ...     A  NaN   
 4   241100  R    37.0   R     49.0   R  76.0   R    28.0   R  ...     R  0.0   
 ..     ... ..     ...  ..      ...  ..   ...  ..     ...  ..  ...   ...  ...   
 66  243647  R   123.0   R    176.0   R  70.0   R   111.0   R  ...     R  0.0   
 67  243665  R   259.0   R    381.0   R  68.0   R   322.0   R  ...     R  0.0   
 68  243744  R  1606.0   R   1673.0   R  96.0   R  1698.0   R  ...     Z  0.0   
 69  243780  R  8887.0   R  10114.0   R  88.0   R  8132.0   R  ...     Z  0.0   
 70  243799  A     NaN   A      NaN   A   NaN   R     0.0   R  ...     A  NaN   
 
     R.11   0.8  R.12   0.9  A.1 Unnamed: 1  R.13  15  
 0      R  32.0     R  17.0    R       53.0     R  17  
 1      R   4.0     R   1.0    R       25.0     R  24  
 2      A   NaN     A   NaN    A        NaN     R  29  
 3      A   NaN     A   NaN    A        NaN     R  20  
 4      R   1.0     R   0.0    R        0.0     R  17  
 ..   ...   ...   ...   ...  ...        ...   ...  ..  
 66     R  15.0     R  11.0    R       73.0     R  10  
 67     R  40.0     R  13.0    R       33.0     R  12  
 68     R   0.0     R   0.0    A        NaN     R   4  
 69     R  30.0     R  21.0    R       70.0     R  13  
 70     A   NaN     A   NaN    A        NaN     R   8  
 
 [71 rows x 33 columns],
     243823  R      35 R.1     142 R.2    25 R.3      36 R.4  ...  Z.1  0.3  \
 0   243832  R    94.0   R   418.0   R  22.0   R    26.0   R  ...    R  0.0   
 1   243841  R    57.0   R   152.0   R  38.0   R   254.0   R  ...    R  0.0   
 2   244233  R    14.0   R   121.0   R  12.0   R     0.0   R  ...    R  0.0   
 3   244279  A     NaN   A     NaN   A   NaN   R    25.0   R  ...    A  NaN   
 4   244437  R  2296.0   R  6143.0   R  37.0   R  2965.0   R  ...    R  0.0   
 ..     ... ..     ...  ..     ...  ..   ...  ..     ...  ..  ...  ...  ...   
 66  260813  R     7.0   R    26.0   R  27.0   R     1.0   Z  ...    A  NaN   
 67  260929  A     NaN   A     NaN   A   NaN   R     1.0   Z  ...    Z  0.0   
 68  260965  A     NaN   A     NaN   A   NaN   R     9.0   Z  ...    A  NaN   
 69  260992  R    37.0   R   108.0   R  34.0   R     0.0   R  ...    R  0.0   
 70  261436  A     NaN   A     NaN   A   NaN   R   289.0   Z  ...    A  NaN   
 
     R.10    5.1  R.11      3  R.12    60  R.13  10  
 0      R    2.0     R    1.0     R  50.0     R  20  
 1      R    7.0     R    3.0     R  43.0     R  17  
 2      R    0.0     R    0.0     A   NaN     R   9  
 3      A    NaN     A    NaN     A   NaN     R  18  
 4      R  967.0     R  509.0     R  53.0     R  20  
 ..   ...    ...   ...    ...   ...   ...   ...  ..  
 66     A    NaN     A    NaN     A   NaN     R  11  
 67     R    0.0     R    0.0     A   NaN     R  15  
 68     A    NaN     A    NaN     A   NaN     R   8  
 69     R    0.0     R    0.0     A   NaN     R  14  
 70     A    NaN     A    NaN     A   NaN     R  14  
 
 [71 rows x 33 columns],
     261676  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      25 R.1  ...  \
 0   261685  A         NaN   A         NaN   A         NaN  R    21.0   R  ...   
 1   261719  A         NaN   A         NaN   A         NaN  R    43.0   R  ...   
 2   261773  A         NaN   A         NaN   A         NaN  R     4.0   Z  ...   
 3   261861  R        32.0   R        52.0   R        62.0  R    29.0   R  ...   
 4   262031  R       991.0   R      1808.0   R        55.0  R   978.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  366632  A         NaN   A         NaN   A         NaN  R    17.0   R  ...   
 67  366711  R      2117.0   R      4666.0   R        45.0  R  2131.0   Z  ...   
 68  367051  A         NaN   A         NaN   A         NaN  R    22.0   R  ...   
 69  367088  A         NaN   A         NaN   A         NaN  R     9.0   R  ...   
 70  367103  A         NaN   A         NaN   A         NaN  R    19.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6   9  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 1     A        NaN    A        NaN    A        NaN    A        NaN    R  18  
 2     R        0.0    R        1.0    R        0.0    R        0.0    R   3  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R  15  
 4     R        0.0    R      312.0    R      124.0    R       40.0    R  20  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 67    Z        0.0    R      114.0    R       64.0    R       56.0    R  27  
 68    R        0.0    R       25.0    R       23.0    R       92.0    R  15  
 69    R        0.0    R        5.0    R        2.0    R       40.0    R  29  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  12  
 
 [71 rows x 33 columns],
     367112  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      70 R.1  ...  \
 0   367158  A         NaN   A         NaN   A         NaN  R    38.0   R  ...   
 1   367334  A         NaN   A         NaN   A         NaN  A     NaN   A  ...   
 2   367361  A         NaN   A         NaN   A         NaN  R    10.0   R  ...   
 3   367431  A         NaN   A         NaN   A         NaN  R    11.0   R  ...   
 4   367459  R       759.0   R      3869.0   R        20.0  R  1033.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  375656  A         NaN   A         NaN   A         NaN  R    30.0   R  ...   
 67  375683  A         NaN   A         NaN   A         NaN  R    67.0   Z  ...   
 68  375726  A         NaN   A         NaN   A         NaN  R   110.0   R  ...   
 69  375939  A         NaN   A         NaN   A         NaN  R    10.0   Z  ...   
 70  375966  A         NaN   A         NaN   A         NaN  R     2.0   R  ...   
 
     R.8  0.3  R.9    2.1  R.10    0.4  R.11   0.5  R.12  43  
 0     A  NaN    A    NaN     A    NaN     A   NaN     R  30  
 1     A  NaN    A    NaN     A    NaN     A   NaN     R  10  
 2     A  NaN    A    NaN     A    NaN     A   NaN     R  12  
 3     A  NaN    A    NaN     A    NaN     A   NaN     R  10  
 4     R  0.0    R  735.0     R  260.0     R  35.0     R  16  
 ..  ...  ...  ...    ...   ...    ...   ...   ...   ...  ..  
 66    R  0.0    R   17.0     R   13.0     R  76.0     R   5  
 67    Z  0.0    R   34.0     R   31.0     R  91.0     R  16  
 68    R  0.0    R   56.0     R   21.0     R  38.0     R   4  
 69    A  NaN    A    NaN     A    NaN     A   NaN     R  10  
 70    R  0.0    R    0.0     R    0.0     A   NaN     R   8  
 
 [71 rows x 33 columns],
     375984  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     95 R.1  ...  \
 0   376224  R        32.0   R        48.0   R        67.0  R   11.0   R  ...   
 1   376242  A         NaN   A         NaN   A         NaN  R   57.0   R  ...   
 2   376288  A         NaN   A         NaN   A         NaN  R   32.0   R  ...   
 3   376321  R       129.0   R       409.0   R        32.0  R   10.0   R  ...   
 4   376330  A         NaN   A         NaN   A         NaN  R   23.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  383765  A         NaN   A         NaN   A         NaN  R   60.0   R  ...   
 67  383996  R        39.0   R       211.0   R        18.0  R   54.0   R  ...   
 68  384236  A         NaN   A         NaN   A         NaN  R  103.0   R  ...   
 69  384245  A         NaN   A         NaN   A         NaN  R   90.0   R  ...   
 70  384254  R        81.0   R       103.0   R        79.0  R   98.0   R  ...   
 
     Z.1  0.3  R.7  25.1  R.8    10  R.9    40  R.10  24  
 0     A  NaN    A   NaN    A   NaN    A   NaN     R  18  
 1     A  NaN    A   NaN    A   NaN    A   NaN     R  25  
 2     A  NaN    A   NaN    A   NaN    A   NaN     R  45  
 3     R  0.0    R   5.0    R   1.0    R  20.0     R  10  
 4     A  NaN    A   NaN    A   NaN    A   NaN     R  19  
 ..  ...  ...  ...   ...  ...   ...  ...   ...   ...  ..  
 66    Z  0.0    R  21.0    R  15.0    R  71.0     R  25  
 67    R  0.0    R  20.0    R   8.0    R  40.0     R  15  
 68    A  NaN    A   NaN    A   NaN    A   NaN     R  25  
 69    A  NaN    A   NaN    A   NaN    A   NaN     R  25  
 70    R  0.0    R   0.0    R   0.0    A   NaN     R  11  
 
 [71 rows x 33 columns],
     384333  R    582 R.1    1766 R.2    33 R.3    813 R.4  ...  Z.1  0.3  \
 0   384342  R  519.0   R  2007.0   R  26.0   R  557.0   R  ...    R  0.0   
 1   384412  R   14.0   R    20.0   R  70.0   R    5.0   R  ...    A  NaN   
 2   384421  R   20.0   R    23.0   R  87.0   R   23.0   R  ...    A  NaN   
 3   385132  A    NaN   A     NaN   A   NaN   R   18.0   R  ...    R  0.0   
 4   385503  A    NaN   A     NaN   A   NaN   R   57.0   Z  ...    A  NaN   
 ..     ... ..    ...  ..     ...  ..   ...  ..    ...  ..  ...  ...  ...   
 66  407179  A    NaN   A     NaN   A   NaN   R   17.0   R  ...    Z  0.0   
 67  407355  A    NaN   A     NaN   A   NaN   R   14.0   R  ...    A  NaN   
 68  407391  A    NaN   A     NaN   A   NaN   R   24.0   R  ...    R  0.0   
 69  407407  A    NaN   A     NaN   A   NaN   R    2.0   R  ...    R  0.0   
 70  407425  A    NaN   A     NaN   A   NaN   R    6.0   Z  ...    Z  0.0   
 
     R.10  544.1  R.11   228  R.12     42  R.13    19  
 0      R  184.0     R  82.0     R   45.0     R  18.0  
 1      A    NaN     A   NaN     A    NaN     R  14.0  
 2      A    NaN     A   NaN     A    NaN     R  15.0  
 3      R    5.0     R   3.0     R   60.0     R  14.0  
 4      A    NaN     A   NaN     A    NaN     R  21.0  
 ..   ...    ...   ...   ...   ...    ...   ...   ...  
 66     R    6.0     R   5.0     R   83.0     R   9.0  
 67     A    NaN     A   NaN     A    NaN     R  11.0  
 68     R   18.0     R  18.0     R  100.0     R   9.0  
 69     R   22.0     R  22.0     R  100.0     R  13.0  
 70     R    2.0     R   2.0     R  100.0     R  14.0  
 
 [71 rows x 33 columns],
     407434  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R    16 R.1  ...  \
 0   407461  A         NaN   A         NaN   A         NaN  R  39.0   R  ...   
 1   407470  A         NaN   A         NaN   A         NaN  R  74.0   R  ...   
 2   407489  A         NaN   A         NaN   A         NaN  R  35.0   R  ...   
 3   407513  A         NaN   A         NaN   A         NaN  R  30.0   R  ...   
 4   407522  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..   ...  ..  ...   
 66  417275  A         NaN   A         NaN   A         NaN  R  24.0   R  ...   
 67  417327  R        12.0   R        46.0   R        26.0  R   7.0   R  ...   
 68  417442  A         NaN   A         NaN   A         NaN  R  17.0   R  ...   
 69  417503  A         NaN   A         NaN   A         NaN  R   8.0   R  ...   
 70  417600  A         NaN   A         NaN   A         NaN  R  10.0   R  ...   
 
     R.8  0.3  R.9  13.1  R.10    10  R.11    77  R.12 10.1  
 0     A  NaN    A   NaN     A   NaN     A   NaN     R    4  
 1     Z  0.0    R  25.0     R   9.0     R  36.0     R   19  
 2     R  0.0    R  25.0     R  24.0     R  96.0     R   10  
 3     R  0.0    R  15.0     R  12.0     R  80.0     R   11  
 4     A  NaN    A   NaN     A   NaN     A   NaN     R    7  
 ..  ...  ...  ...   ...   ...   ...   ...   ...   ...  ...  
 66    A  NaN    A   NaN     A   NaN     A   NaN     R   18  
 67    R  0.0    R   4.0     R   2.0     R  50.0     R   10  
 68    A  NaN    A   NaN     A   NaN     A   NaN     R   14  
 69    R  0.0    R   2.0     R   0.0     R   0.0     R    5  
 70    A  NaN    A   NaN     A   NaN     A   NaN     R    7  
 
 [71 rows x 33 columns],
     417628  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      7 R.1  ...  \
 0   417637  A         NaN   A         NaN   A         NaN  R   25.0   R  ...   
 1   417646  A         NaN   A         NaN   A         NaN  R   10.0   Z  ...   
 2   417655  A         NaN   A         NaN   A         NaN  R   21.0   R  ...   
 3   417682  A         NaN   A         NaN   A         NaN  R    1.0   R  ...   
 4   417716  A         NaN   A         NaN   A         NaN  R    8.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  419484  A         NaN   A         NaN   A         NaN  R   11.0   R  ...   
 67  419633  A         NaN   A         NaN   A         NaN  R   27.0   R  ...   
 68  419703  A         NaN   A         NaN   A         NaN  R  315.0   Z  ...   
 69  419712  A         NaN   A         NaN   A         NaN  R   98.0   Z  ...   
 70  419721  A         NaN   A         NaN   A         NaN  R  147.0   Z  ...   
 
     R.8 0.3  R.9 4.1  R.10 4.2  R.11 100  R.12   3  
 0     A NaN    A NaN     A NaN     A NaN     R   7  
 1     A NaN    A NaN     A NaN     A NaN     R   8  
 2     A NaN    A NaN     A NaN     A NaN     R   5  
 3     A NaN    A NaN     A NaN     A NaN     R  10  
 4     A NaN    A NaN     A NaN     A NaN     R  13  
 ..  ...  ..  ...  ..   ...  ..   ...  ..   ...  ..  
 66    A NaN    A NaN     A NaN     A NaN     R  14  
 67    A NaN    A NaN     A NaN     A NaN     R  10  
 68    A NaN    A NaN     A NaN     A NaN     R  26  
 69    A NaN    A NaN     A NaN     A NaN     R  13  
 70    A NaN    A NaN     A NaN     A NaN     R  10  
 
 [71 rows x 33 columns],
     420024  R    20 R.1  20.1 R.2   100 R.3      0  Z  ...  A.3 Unnamed: 3  \
 0   420042  A   NaN   A   NaN   A   NaN   R    2.0  R  ...    A        NaN   
 1   420130  A   NaN   A   NaN   A   NaN   R   58.0  Z  ...    A        NaN   
 2   420255  A   NaN   A   NaN   A   NaN   R   19.0  R  ...    A        NaN   
 3   420325  R  25.0   R  31.0   R  81.0   R   24.0  R  ...    A        NaN   
 4   420343  A   NaN   A   NaN   A   NaN   R  222.0  R  ...    R        0.0   
 ..     ... ..   ...  ..   ...  ..   ...  ..    ... ..  ...  ...        ...   
 66  431071  A   NaN   A   NaN   A   NaN   R    8.0  R  ...    R        0.0   
 67  431099  A   NaN   A   NaN   A   NaN   R   30.0  R  ...    R        0.0   
 68  431105  A   NaN   A   NaN   A   NaN   R    6.0  R  ...    Z        0.0   
 69  431123  A   NaN   A   NaN   A   NaN   R   81.0  R  ...    A        NaN   
 70  431141  A   NaN   A   NaN   A   NaN   R  223.0  R  ...    A        NaN   
 
     A.4 Unnamed: 4  A.5 Unnamed: 5  A.6 Unnamed: 6  R.5  14  
 0     A        NaN    A        NaN    A        NaN    R   8  
 1     A        NaN    A        NaN    A        NaN    R  30  
 2     A        NaN    A        NaN    A        NaN    R  14  
 3     A        NaN    A        NaN    A        NaN    R  14  
 4     R       43.0    R       35.0    R       81.0    R  26  
 ..  ...        ...  ...        ...  ...        ...  ...  ..  
 66    R        2.0    R        1.0    R       50.0    R  12  
 67    R       22.0    R       13.0    R       59.0    R   9  
 68    R       19.0    R       10.0    R       53.0    R   6  
 69    A        NaN    A        NaN    A        NaN    R  14  
 70    A        NaN    A        NaN    A        NaN    R  13  
 
 [71 rows x 33 columns],
     431169  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     16 R.1  ...  \
 0   431187  A         NaN   A         NaN   A         NaN  R   77.0   R  ...   
 1   431196  A         NaN   A         NaN   A         NaN  R   10.0   R  ...   
 2   431266  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 3   431275  A         NaN   A         NaN   A         NaN  R    3.0   R  ...   
 4   431284  A         NaN   A         NaN   A         NaN  R    4.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  437316  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 67  437325  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 68  437556  A         NaN   A         NaN   A         NaN  R  414.0   R  ...   
 69  437608  A         NaN   A         NaN   A         NaN  R   42.0   R  ...   
 70  437635  A         NaN   A         NaN   A         NaN  R  137.0   Z  ...   
 
     R.8  0.4  R.9   0.5  R.10   0.6  A.3 Unnamed: 3  R.11   7  
 0     A  NaN    A   NaN     A   NaN    A        NaN     R   7  
 1     A  NaN    A   NaN     A   NaN    A        NaN     R  24  
 2     R  0.0    R  44.0     R  24.0    R       55.0     R  10  
 3     R  0.0    R   6.0     R   4.0    R       67.0     R   7  
 4     R  0.0    R   3.0     R   2.0    R       67.0     R  25  
 ..  ...  ...  ...   ...   ...   ...  ...        ...   ...  ..  
 66    A  NaN    A   NaN     A   NaN    A        NaN     R  14  
 67    A  NaN    A   NaN     A   NaN    A        NaN     R   6  
 68    R  0.0    R  14.0     R  14.0    R      100.0     R  18  
 69    R  0.0    R   9.0     R   7.0    R       78.0     R  20  
 70    A  NaN    A   NaN     A   NaN    A        NaN     R  19  
 
 [71 rows x 33 columns],
     437705  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R       8 R.1  ...  \
 0   437723  A         NaN   A         NaN   A         NaN  R    78.0   R  ...   
 1   437732  A         NaN   A         NaN   A         NaN  R     0.0   R  ...   
 2   437750  R        40.0   R        40.0   R       100.0  R    46.0   R  ...   
 3   437769  A         NaN   A         NaN   A         NaN  R    30.0   R  ...   
 4   437778  A         NaN   A         NaN   A         NaN  R   197.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..     ...  ..  ...   
 66  441168  A         NaN   A         NaN   A         NaN  R    13.0   Z  ...   
 67  441210  A         NaN   A         NaN   A         NaN  R     0.0   R  ...   
 68  441229  R       180.0   R       268.0   R        67.0  R   157.0   R  ...   
 69  441256  A         NaN   A         NaN   A         NaN  R    13.0   R  ...   
 70  441371  A         NaN   A         NaN   A         NaN  R  4693.0   R  ...   
 
     R.8  0.4  R.9   0.5  R.10   0.6  A.3 Unnamed: 3  R.11   3  
 0     A  NaN    A   NaN     A   NaN    A        NaN     R  15  
 1     A  NaN    A   NaN     A   NaN    A        NaN     R   8  
 2     A  NaN    A   NaN     A   NaN    A        NaN     R  17  
 3     A  NaN    A   NaN     A   NaN    A        NaN     R  20  
 4     A  NaN    A   NaN     A   NaN    A        NaN     R  25  
 ..  ...  ...  ...   ...   ...   ...  ...        ...   ...  ..  
 66    A  NaN    A   NaN     A   NaN    A        NaN     R  10  
 67    R  0.0    R  61.0     R  11.0    R       18.0     R  15  
 68    A  NaN    A   NaN     A   NaN    A        NaN     R   7  
 69    A  NaN    A   NaN     A   NaN    A        NaN     R   4  
 70    A  NaN    A   NaN     A   NaN    A        NaN     R  16  
 
 [71 rows x 33 columns],
     441380  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     21 R.1  ...  \
 0   441414  A         NaN   A         NaN   A         NaN  R   17.0   R  ...   
 1   441423  A         NaN   A         NaN   A         NaN  R  206.0   R  ...   
 2   441487  R         7.0   R        35.0   R        20.0  R    8.0   R  ...   
 3   441496  A         NaN   A         NaN   A         NaN  R   29.0   Z  ...   
 4   441502  A         NaN   A         NaN   A         NaN  R  101.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  443340  R         4.0   R         9.0   R        44.0  R    5.0   R  ...   
 67  443377  R        13.0   R        16.0   R        81.0  R   11.0   R  ...   
 68  443410  R       159.0   R       286.0   R        56.0  R  185.0   R  ...   
 69  443492  R       308.0   R       507.0   R        61.0  R  325.0   R  ...   
 70  443562  R        38.0   R        70.0   R        54.0  R   54.0   R  ...   
 
     R.8  0.3  R.9  11.1  R.10   0.4  R.11    0.5  R.12  25  
 0     R  0.0    R  69.0     R  54.0     R   78.0     R  20  
 1     A  NaN    A   NaN     A   NaN     A    NaN     R  22  
 2     R  0.0    R   1.0     R   0.0     R    0.0     R  10  
 3     A  NaN    A   NaN     A   NaN     A    NaN     R  10  
 4     A  NaN    A   NaN     A   NaN     A    NaN     R  18  
 ..  ...  ...  ...   ...   ...   ...   ...    ...   ...  ..  
 66    R  0.0    R   1.0     R   1.0     R  100.0     R   7  
 67    A  NaN    A   NaN     A   NaN     A    NaN     R   6  
 68    R  0.0    R   1.0     R   1.0     R  100.0     R  10  
 69    Z  0.0    R  40.0     R  17.0     R   43.0     R  14  
 70    R  0.0    R  16.0     R  10.0     R   63.0     R  15  
 
 [71 rows x 33 columns],
     443571  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     55 R.1  ...  \
 0   443599  R         3.0   R       233.0   R         1.0  R    1.0   R  ...   
 1   443632  A         NaN   A         NaN   A         NaN  R   23.0   Z  ...   
 2   443641  A         NaN   A         NaN   A         NaN  R    9.0   Z  ...   
 3   443650  A         NaN   A         NaN   A         NaN  R  116.0   R  ...   
 4   443669  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  445498  A         NaN   A         NaN   A         NaN  R   25.0   R  ...   
 67  445540  A         NaN   A         NaN   A         NaN  R   18.0   R  ...   
 68  445638  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 69  445647  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 70  445656  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.5  15  
 0     R        0.0    R        1.0    R        0.0    R        0.0    R  19  
 1     A        NaN    A        NaN    A        NaN    A        NaN    R  13  
 2     Z        0.0    R        9.0    R        8.0    R       89.0    R  15  
 3     R        0.0    R        3.0    R        2.0    R       67.0    R  31  
 4     A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    A        NaN    R  11  
 67    A        NaN    A        NaN    A        NaN    A        NaN    R  19  
 68    A        NaN    A        NaN    A        NaN    A        NaN    R  19  
 69    A        NaN    A        NaN    A        NaN    A        NaN    R  14  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  18  
 
 [71 rows x 33 columns],
     445692  R    205 R.1    227 R.2    90 R.3    183 R.4  ...  R.11  0.4  \
 0   445708  R  473.0   R  519.0   R  91.0   R  330.0   R  ...     R  0.0   
 1   445735  A    NaN   A    NaN   A   NaN   A    NaN   A  ...     A  NaN   
 2   445744  A    NaN   A    NaN   A   NaN   R   42.0   Z  ...     A  NaN   
 3   445762  A    NaN   A    NaN   A   NaN   R   13.0   R  ...     A  NaN   
 4   445780  A    NaN   A    NaN   A   NaN   R    2.0   Z  ...     Z  0.0   
 ..     ... ..    ...  ..    ...  ..   ...  ..    ...  ..  ...   ...  ...   
 66  447795  A    NaN   A    NaN   A   NaN   R   32.0   R  ...     A  NaN   
 67  447810  A    NaN   A    NaN   A   NaN   R  110.0   R  ...     A  NaN   
 68  447847  R   30.0   R   39.0   R  77.0   R   73.0   Z  ...     A  NaN   
 69  447865  A    NaN   A    NaN   A   NaN   R   13.0   R  ...     R  0.0   
 70  447874  A    NaN   A    NaN   A   NaN   R   27.0   R  ...     A  NaN   
 
     R.12  0.5  R.13  0.6  A Unnamed: 0  R.14  30  
 0      R  0.0     R  0.0  A        NaN     R  15  
 1      A  NaN     A  NaN  A        NaN     R  10  
 2      A  NaN     A  NaN  A        NaN     R  24  
 3      A  NaN     A  NaN  A        NaN     R  11  
 4      R  1.0     R  1.0  R      100.0     R   4  
 ..   ...  ...   ...  ... ..        ...   ...  ..  
 66     A  NaN     A  NaN  A        NaN     R  10  
 67     A  NaN     A  NaN  A        NaN     R  10  
 68     A  NaN     A  NaN  A        NaN     R   5  
 69     R  9.0     R  7.0  R       78.0     R  10  
 70     A  NaN     A  NaN  A        NaN     R  14  
 
 [71 rows x 33 columns],
     447883  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R    44 R.1  ...  \
 0   447892  A         NaN   A         NaN   A         NaN  R  32.0   R  ...   
 1   447908  A         NaN   A         NaN   A         NaN  R   4.0   Z  ...   
 2   447917  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 3   447935  R        47.0   R        96.0   R        49.0  R  63.0   R  ...   
 4   447953  R        34.0   R       108.0   R        31.0  R  70.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..   ...  ..  ...   
 66  449773  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 67  449782  A         NaN   A         NaN   A         NaN  R  16.0   R  ...   
 68  449807  A         NaN   A         NaN   A         NaN  R  27.0   R  ...   
 69  449816  A         NaN   A         NaN   A         NaN  R  29.0   R  ...   
 70  449861  A         NaN   A         NaN   A         NaN  R  96.0   R  ...   
 
     R.8  0.3  R.9  47.1  R.10    25  R.11     53  R.12   7  
 0     Z  0.0    R   9.0     R   5.0     R   56.0     R  17  
 1     Z  0.0    R   1.0     R   1.0     R  100.0     R  20  
 2     A  NaN    A   NaN     A   NaN     A    NaN     R  11  
 3     A  NaN    A   NaN     A   NaN     A    NaN     R  15  
 4     R  0.0    R   4.0     R   2.0     R   50.0     R  10  
 ..  ...  ...  ...   ...   ...   ...   ...    ...   ...  ..  
 66    A  NaN    A   NaN     A   NaN     A    NaN     R  12  
 67    R  0.0    R  24.0     R  17.0     R   71.0     R  12  
 68    A  NaN    A   NaN     A   NaN     A    NaN     R  21  
 69    R  0.0    R  59.0     R  39.0     R   66.0     R   8  
 70    A  NaN    A   NaN     A   NaN     A    NaN     R  19  
 
 [71 rows x 33 columns],
     449870  R     7 R.1    22 R.2    32 R.3      8 R.4  ...  A.2 Unnamed: 2  \
 0   449889  A   NaN   A   NaN   A   NaN   A    NaN   A  ...    A        NaN   
 1   449898  R  12.0   R  98.0   R  12.0   R    6.0   R  ...    R        0.0   
 2   449904  A   NaN   A   NaN   A   NaN   R  113.0   R  ...    A        NaN   
 3   449931  R   1.0   R  44.0   R   2.0   R    4.0   R  ...    R        0.0   
 4   449959  A   NaN   A   NaN   A   NaN   R   30.0   Z  ...    Z        0.0   
 ..     ... ..   ...  ..   ...  ..   ...  ..    ...  ..  ...  ...        ...   
 66  451574  A   NaN   A   NaN   A   NaN   R   35.0   R  ...    A        NaN   
 67  451583  A   NaN   A   NaN   A   NaN   R   52.0   R  ...    R        0.0   
 68  451626  A   NaN   A   NaN   A   NaN   R   46.0   R  ...    R        0.0   
 69  451714  A   NaN   A   NaN   A   NaN   R   42.0   Z  ...    Z        0.0   
 70  451741  R  40.0   R  62.0   R  65.0   R    0.0   R  ...    A        NaN   
 
     A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9   5  
 0     A        NaN    A        NaN    A        NaN    R  10  
 1     R        1.0    R        0.0    R        0.0    R  13  
 2     A        NaN    A        NaN    A        NaN    R  17  
 3     R        5.0    R        2.0    R       40.0    R   9  
 4     R       20.0    R       18.0    R       90.0    R  15  
 ..  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    R  10  
 67    R      105.0    R       75.0    R       71.0    R   8  
 68    R        5.0    R        0.0    R        0.0    R   7  
 69    R        0.0    R        0.0    A        NaN    R  18  
 70    A        NaN    A        NaN    A        NaN    R  15  
 
 [71 rows x 33 columns],
     451750  R   170 R.1    539 R.2    32 R.3    39 R.4  ...  R.11  0.4  R.12  \
 0   451820  R  26.0   R   52.0   R  50.0   R  30.0   Z  ...     R  0.0     R   
 1   451857  A   NaN   A    NaN   A   NaN   R  57.0   R  ...     A  NaN     A   
 2   451866  A   NaN   A    NaN   A   NaN   R   0.0   R  ...     A  NaN     A   
 3   451918  A   NaN   A    NaN   A   NaN   R  15.0   R  ...     A  NaN     A   
 4   451927  R  76.0   R  113.0   R  67.0   R  65.0   R  ...     R  0.0     R   
 ..     ... ..   ...  ..    ...  ..   ...  ..   ...  ..  ...   ...  ...   ...   
 66  455196  A   NaN   A    NaN   A   NaN   R  94.0   Z  ...     A  NaN     A   
 67  455202  A   NaN   A    NaN   A   NaN   R  54.0   Z  ...     Z  0.0     R   
 68  455211  A   NaN   A    NaN   A   NaN   R  61.0   R  ...     A  NaN     A   
 69  455220  A   NaN   A    NaN   A   NaN   R  45.0   R  ...     A  NaN     A   
 70  455239  A   NaN   A    NaN   A   NaN   R   8.0   R  ...     R  0.0     R   
 
      0.5  R.13  0.6  A Unnamed: 0  R.14  18  
 0   14.0     R  4.0  R       29.0     R  23  
 1    NaN     A  NaN  A        NaN     R   7  
 2    NaN     A  NaN  A        NaN     R  11  
 3    NaN     A  NaN  A        NaN     R  12  
 4   11.0     R  8.0  R       73.0     R  12  
 ..   ...   ...  ... ..        ...   ...  ..  
 66   NaN     A  NaN  A        NaN     R  26  
 67   2.0     R  1.0  R       50.0     R  15  
 68   NaN     A  NaN  A        NaN     R   9  
 69   NaN     A  NaN  A        NaN     R  12  
 70   3.0     R  3.0  R      100.0     R   8  
 
 [71 rows x 33 columns],
     455257  R   60 R.1    76 R.2    79 R.3    90 R.4  ...  A.2 Unnamed: 2  \
 0   455275  A  NaN   A   NaN   A   NaN   R  80.0   R  ...    A        NaN   
 1   455284  A  NaN   A   NaN   A   NaN   R  33.0   Z  ...    Z        0.0   
 2   455327  A  NaN   A   NaN   A   NaN   A   NaN   A  ...    A        NaN   
 3   455336  A  NaN   A   NaN   A   NaN   R  70.0   R  ...    A        NaN   
 4   455354  A  NaN   A   NaN   A   NaN   R  55.0   R  ...    A        NaN   
 ..     ... ..  ...  ..   ...  ..   ...  ..   ...  ..  ...  ...        ...   
 66  457192  A  NaN   A   NaN   A   NaN   R  11.0   R  ...    A        NaN   
 67  457208  A  NaN   A   NaN   A   NaN   R  23.0   R  ...    A        NaN   
 68  457226  R  4.0   R  11.0   R  36.0   R   0.0   R  ...    R        0.0   
 69  457253  A  NaN   A   NaN   A   NaN   R   5.0   R  ...    R        0.0   
 70  457299  A  NaN   A   NaN   A   NaN   R   0.0   R  ...    R        0.0   
 
     A.3 Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9  27  
 0     A        NaN    A        NaN    A        NaN    R  15  
 1     R       14.0    R       12.0    R       86.0    R  14  
 2     A        NaN    A        NaN    A        NaN    R   7  
 3     A        NaN    A        NaN    A        NaN    R  20  
 4     A        NaN    A        NaN    A        NaN    R  15  
 ..  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    R  10  
 67    A        NaN    A        NaN    A        NaN    R  15  
 68    R        2.0    R        1.0    R       50.0    R   5  
 69    R        9.0    R        7.0    R       78.0    R  15  
 70    R        0.0    R        0.0    A        NaN    R  25  
 
 [71 rows x 33 columns],
     457314  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R    54 R.1  ...  \
 0   457323  A         NaN   A         NaN   A         NaN  R  72.0   R  ...   
 1   457332  A         NaN   A         NaN   A         NaN  R   8.0   R  ...   
 2   457341  A         NaN   A         NaN   A         NaN  R  20.0   R  ...   
 3   457350  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 4   457378  A         NaN   A         NaN   A         NaN  R  37.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..   ...  ..  ...   
 66  458274  A         NaN   A         NaN   A         NaN  R  18.0   Z  ...   
 67  458380  A         NaN   A         NaN   A         NaN  R  26.0   R  ...   
 68  458405  A         NaN   A         NaN   A         NaN  R  94.0   R  ...   
 69  458441  A         NaN   A         NaN   A         NaN  R  79.0   R  ...   
 70  458496  R        17.0   R        58.0   R        29.0  R  15.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6  10  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R  22  
 1     R        0.0    R        0.0    R        0.0    A        NaN    R   9  
 2     R        0.0    R        0.0    R        0.0    A        NaN    R  13  
 3     A        NaN    A        NaN    A        NaN    A        NaN    P  15  
 4     R        0.0    R       50.0    R       27.0    R       54.0    R  14  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    Z        0.0    R        0.0    R        0.0    A        NaN    R  15  
 67    A        NaN    A        NaN    A        NaN    A        NaN    R  19  
 68    A        NaN    A        NaN    A        NaN    A        NaN    R  20  
 69    R        0.0    R        1.0    R        1.0    R      100.0    R  27  
 70    R        0.0    R        8.0    R        4.0    R       50.0    R  20  
 
 [71 rows x 33 columns],
     458681  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     47  Z  ...  \
 0   458803  A         NaN   A         NaN   A         NaN  R   36.0  R  ...   
 1   458812  A         NaN   A         NaN   A         NaN  R   16.0  R  ...   
 2   458821  A         NaN   A         NaN   A         NaN  R   17.0  R  ...   
 3   458830  A         NaN   A         NaN   A         NaN  R   45.0  R  ...   
 4   458858  A         NaN   A         NaN   A         NaN  R   34.0  R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ... ..  ...   
 66  460190  A         NaN   A         NaN   A         NaN  R   61.0  Z  ...   
 67  460206  A         NaN   A         NaN   A         NaN  R  156.0  Z  ...   
 68  460349  R         6.0   R       317.0   R         2.0  R    2.0  R  ...   
 69  460376  A         NaN   A         NaN   A         NaN  R    4.0  R  ...   
 70  460385  A         NaN   A         NaN   A         NaN  R   32.0  R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.4  35  
 0     Z        0.0    R        9.0    R        6.0    R       67.0    R  13  
 1     Z        0.0    R        0.0    R        0.0    A        NaN    R  18  
 2     A        NaN    A        NaN    A        NaN    A        NaN    R  14  
 3     Z        0.0    R        5.0    R        2.0    R       40.0    R  21  
 4     Z        0.0    R       22.0    R        8.0    R       36.0    R  24  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    A        NaN    R  28  
 67    A        NaN    A        NaN    A        NaN    A        NaN    R  29  
 68    R        0.0    R       16.0    R        5.0    R       31.0    R  11  
 69    R        0.0    R        0.0    R        0.0    A        NaN    R   4  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  11  
 
 [71 rows x 33 columns],
     460394  R    514 R.1    3013 R.2    17 R.3    846 R.4  ...  R.11  0.3  \
 0   460455  A    NaN   A     NaN   A   NaN   R    2.0   R  ...     R  0.0   
 1   460464  R  593.0   R  3372.0   R  18.0   R  776.0   R  ...     R  0.0   
 2   460482  A    NaN   A     NaN   A   NaN   R   35.0   R  ...     A  NaN   
 3   460516  A    NaN   A     NaN   A   NaN   R  124.0   R  ...     A  NaN   
 4   460525  A    NaN   A     NaN   A   NaN   A    NaN   A  ...     A  NaN   
 ..     ... ..    ...  ..     ...  ..   ...  ..    ...  ..  ...   ...  ...   
 66  461722  A    NaN   A     NaN   A   NaN   R    5.0   R  ...     A  NaN   
 67  461740  A    NaN   A     NaN   A   NaN   R   13.0   R  ...     A  NaN   
 68  461759  R   14.0   R    22.0   R  64.0   R   24.0   R  ...     R  0.0   
 69  461768  A    NaN   A     NaN   A   NaN   A    NaN   A  ...     A  NaN   
 70  461786  A    NaN   A     NaN   A   NaN   R   24.0   R  ...     A  NaN   
 
     R.12  960.1  R.13    347  R.14    36  R.15  21  
 0      R   24.0     R   16.0     R  67.0     R   4  
 1      R  900.0     R  347.0     R  39.0     R  29  
 2      A    NaN     A    NaN     A   NaN     R  10  
 3      A    NaN     A    NaN     A   NaN     R  14  
 4      A    NaN     A    NaN     A   NaN     R  11  
 ..   ...    ...   ...    ...   ...   ...   ...  ..  
 66     A    NaN     A    NaN     A   NaN     R   7  
 67     A    NaN     A    NaN     A   NaN     R  23  
 68     R    0.0     R    0.0     A   NaN     R   4  
 69     A    NaN     A    NaN     A   NaN     R  41  
 70     A    NaN     A    NaN     A   NaN     R   5  
 
 [71 rows x 33 columns],
     461795  R  124 R.1   189 R.2    66 R.3     49 R.4  ...  R.11  0.3  R.12  \
 0   461810  A  NaN   A   NaN   A   NaN   R   15.0   R  ...     R  0.0     R   
 1   461829  A  NaN   A   NaN   A   NaN   R  127.0   R  ...     A  NaN     A   
 2   461838  A  NaN   A   NaN   A   NaN   A    NaN   A  ...     A  NaN     A   
 3   461847  R  8.0   R  26.0   R  31.0   R    7.0   R  ...     A  NaN     A   
 4   461856  A  NaN   A   NaN   A   NaN   R  195.0   R  ...     A  NaN     A   
 ..     ... ..  ...  ..   ...  ..   ...  ..    ...  ..  ...   ...  ...   ...   
 66  475273  A  NaN   A   NaN   A   NaN   A    NaN   A  ...     A  NaN     A   
 67  475282  A  NaN   A   NaN   A   NaN   R    8.0   R  ...     A  NaN     A   
 68  475325  A  NaN   A   NaN   A   NaN   R  222.0   R  ...     A  NaN     A   
 69  475398  A  NaN   A   NaN   A   NaN   R    0.0   R  ...     R  0.0     R   
 70  475404  A  NaN   A   NaN   A   NaN   A    NaN   A  ...     A  NaN     A   
 
     7.1  R.13  0.4  R.14    0.5  R.15  14  
 0   4.0     R  0.0     R    0.0     R  15  
 1   NaN     A  NaN     A    NaN     R  10  
 2   NaN     A  NaN     A    NaN     R  20  
 3   NaN     A  NaN     A    NaN     R  15  
 4   NaN     A  NaN     A    NaN     R  30  
 ..  ...   ...  ...   ...    ...   ...  ..  
 66  NaN     A  NaN     A    NaN     R   9  
 67  NaN     A  NaN     A    NaN     R  12  
 68  NaN     A  NaN     A    NaN     R  15  
 69  2.0     R  2.0     R  100.0     R   4  
 70  NaN     A  NaN     A    NaN     R  12  
 
 [71 rows x 33 columns],
     475413  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     16 R.1  ...  \
 0   475422  A         NaN   A         NaN   A         NaN  R    0.0   R  ...   
 1   475431  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 2   475459  A         NaN   A         NaN   A         NaN  R  491.0   R  ...   
 3   475468  A         NaN   A         NaN   A         NaN  R   16.0   R  ...   
 4   475477  R        15.0   R        35.0   R        43.0  R    6.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  478616  A         NaN   A         NaN   A         NaN  R   20.0   R  ...   
 67  478634  A         NaN   A         NaN   A         NaN  R    4.0   Z  ...   
 68  478661  A         NaN   A         NaN   A         NaN  R   17.0   R  ...   
 69  478917  A         NaN   A         NaN   A         NaN  R   31.0   Z  ...   
 70  478953  A         NaN   A         NaN   A         NaN  R   39.0   R  ...   
 
     Z.1  0.3  R.7   2.1  R.8   2.2  R.9   100  R.10  13  
 0     R  0.0    R   0.0    R   0.0    A   NaN     R   5  
 1     A  NaN    A   NaN    A   NaN    A   NaN     R   9  
 2     R  0.0    R  16.0    R  14.0    R  88.0     R  20  
 3     R  0.0    R  19.0    R  17.0    R  89.0     R  13  
 4     R  0.0    R   1.0    R   0.0    R   0.0     R   9  
 ..  ...  ...  ...   ...  ...   ...  ...   ...   ...  ..  
 66    Z  0.0    R   7.0    R   3.0    R  43.0     R  14  
 67    A  NaN    A   NaN    A   NaN    A   NaN     R  16  
 68    R  0.0    R   6.0    R   4.0    R  67.0     R  11  
 69    A  NaN    A   NaN    A   NaN    A   NaN     R  10  
 70    A  NaN    A   NaN    A   NaN    A   NaN     R  14  
 
 [71 rows x 33 columns],
     479062  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2 A.3  Unnamed: 3 A.4  \
 0   479248  A         NaN   A         NaN   A         NaN   R        81.0   R   
 1   479424  A         NaN   A         NaN   A         NaN   R        43.0   R   
 2   479965  A         NaN   A         NaN   A         NaN   R         4.0   R   
 3   479974  A         NaN   A         NaN   A         NaN   R        29.0   Z   
 4   479983  A         NaN   A         NaN   A         NaN   R         9.0   R   
 ..     ... ..         ...  ..         ...  ..         ...  ..         ...  ..   
 66  481368  A         NaN   A         NaN   A         NaN   R        11.0   R   
 67  481386  A         NaN   A         NaN   A         NaN   R         2.0   Z   
 68  481401  R         3.0   R        23.0   R        13.0   R         0.0   Z   
 69  481410  R        21.0   R        26.0   R        81.0   R        16.0   R   
 70  481429  A         NaN   A         NaN   A         NaN   R        49.0   R   
 
     ...  A.11 Unnamed: 11  A.12 Unnamed: 12  A.13 Unnamed: 13  A.14  \
 0   ...     R         0.0     R         3.0     R         2.0     R   
 1   ...     Z         0.0     R         0.0     Z         0.0     A   
 2   ...     A         NaN     A         NaN     A         NaN     A   
 3   ...     A         NaN     A         NaN     A         NaN     A   
 4   ...     Z         0.0     R         0.0     Z         0.0     A   
 ..  ...   ...         ...   ...         ...   ...         ...   ...   
 66  ...     A         NaN     A         NaN     A         NaN     A   
 67  ...     Z         0.0     R        32.0     R        29.0     R   
 68  ...     Z         0.0     R        11.0     R         8.0     R   
 69  ...     A         NaN     A         NaN     A         NaN     A   
 70  ...     A         NaN     A         NaN     A         NaN     A   
 
    Unnamed: 14  R   6  
 0         67.0  R  15  
 1          NaN  R  15  
 2          NaN  R  12  
 3          NaN  R  20  
 4          NaN  R  15  
 ..         ... ..  ..  
 66         NaN  R   9  
 67        91.0  R  18  
 68        73.0  R  13  
 69         NaN  R  17  
 70         NaN  R  30  
 
 [71 rows x 33 columns],
     481438  R  17 R.1  18 R.2  94 R.3    28 R.4  ...  A.2 Unnamed: 2  A.3  \
 0   481447  A NaN   A NaN   A NaN   A   NaN   A  ...    A        NaN    A   
 1   481456  A NaN   A NaN   A NaN   R   3.0   R  ...    R        0.0    R   
 2   481465  A NaN   A NaN   A NaN   R   3.0   R  ...    A        NaN    A   
 3   481474  A NaN   A NaN   A NaN   R  21.0   R  ...    R        0.0    R   
 4   481483  A NaN   A NaN   A NaN   R  39.0   R  ...    R        0.0    R   
 ..     ... ..  ..  ..  ..  ..  ..  ..   ...  ..  ...  ...        ...  ...   
 66  483221  A NaN   A NaN   A NaN   R   2.0   R  ...    A        NaN    A   
 67  483230  A NaN   A NaN   A NaN   A   NaN   A  ...    A        NaN    A   
 68  483258  A NaN   A NaN   A NaN   R   7.0   Z  ...    A        NaN    A   
 69  483276  A NaN   A NaN   A NaN   R  29.0   R  ...    A        NaN    A   
 70  483328  A NaN   A NaN   A NaN   R  12.0   Z  ...    Z        0.0    R   
 
    Unnamed: 3  A.4 Unnamed: 4  A.5 Unnamed: 5  R.9   6  
 0         NaN    A        NaN    A        NaN    R  12  
 1         4.0    R        3.0    R       75.0    R   6  
 2         NaN    A        NaN    A        NaN    R  10  
 3         5.0    R        5.0    R      100.0    R  15  
 4        18.0    R       14.0    R       78.0    R  15  
 ..        ...  ...        ...  ...        ...  ...  ..  
 66        NaN    A        NaN    A        NaN    R   4  
 67        NaN    A        NaN    A        NaN    R  11  
 68        NaN    A        NaN    A        NaN    R  20  
 69        NaN    A        NaN    A        NaN    R  12  
 70        3.0    R        2.0    R       67.0    R  15  
 
 [71 rows x 33 columns],
     483337  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R    14  Z  ...  \
 0   483346  A         NaN   A         NaN   A         NaN  R   4.0  R  ...   
 1   483355  A         NaN   A         NaN   A         NaN  R  12.0  R  ...   
 2   483364  A         NaN   A         NaN   A         NaN  R   9.0  R  ...   
 3   483373  A         NaN   A         NaN   A         NaN  R   1.0  Z  ...   
 4   483382  A         NaN   A         NaN   A         NaN  A   NaN  A  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..   ... ..  ...   
 66  484729  A         NaN   A         NaN   A         NaN  A   NaN  A  ...   
 67  484756  A         NaN   A         NaN   A         NaN  R   2.0  R  ...   
 68  484783  A         NaN   A         NaN   A         NaN  A   NaN  A  ...   
 69  484826  A         NaN   A         NaN   A         NaN  R  17.0  Z  ...   
 70  484835  R        22.0   R        64.0   R        34.0  R   2.0  R  ...   
 
     Z.3  0.3  R.5  4.1  R.6    1  R.7  25  R.8  15  
 0     R  0.0    R  0.0    R  0.0    A NaN    R   8  
 1     A  NaN    A  NaN    A  NaN    A NaN    R  10  
 2     A  NaN    A  NaN    A  NaN    A NaN    R   9  
 3     Z  0.0    R  0.0    Z  0.0    A NaN    R  20  
 4     A  NaN    A  NaN    A  NaN    A NaN    R  20  
 ..  ...  ...  ...  ...  ...  ...  ...  ..  ...  ..  
 66    A  NaN    A  NaN    A  NaN    A NaN    R   4  
 67    A  NaN    A  NaN    A  NaN    A NaN    R   8  
 68    A  NaN    A  NaN    A  NaN    A NaN    R   5  
 69    A  NaN    A  NaN    A  NaN    A NaN    R  12  
 70    A  NaN    A  NaN    A  NaN    A NaN    R  14  
 
 [71 rows x 33 columns],
     484862  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2 A.3  Unnamed: 3 A.4  \
 0   484871  R        35.0   R        35.0   R       100.0   R        33.0   Z   
 1   484899  A         NaN   A         NaN   A         NaN   R        59.0   Z   
 2   484905  R       327.0   R       985.0   R        33.0   R       390.0   R   
 3   484923  A         NaN   A         NaN   A         NaN   A         NaN   A   
 4   484932  R       321.0   R       662.0   R        48.0   R       389.0   R   
 ..     ... ..         ...  ..         ...  ..         ...  ..         ...  ..   
 66  486406  A         NaN   A         NaN   A         NaN   R         3.0   R   
 67  486415  A         NaN   A         NaN   A         NaN   R        46.0   R   
 68  486424  A         NaN   A         NaN   A         NaN   R        33.0   R   
 69  486442  A         NaN   A         NaN   A         NaN   R         5.0   Z   
 70  486488  R         2.0   R        12.0   R        17.0   R         1.0   R   
 
     ...  A.11 Unnamed: 11  A.12 Unnamed: 12  A.13 Unnamed: 13  A.14  \
 0   ...     A         NaN     A         NaN     A         NaN     A   
 1   ...     A         NaN     A         NaN     A         NaN     A   
 2   ...     R         0.0     R        34.0     R        12.0     R   
 3   ...     A         NaN     A         NaN     A         NaN     A   
 4   ...     R         0.0     R       136.0     R        49.0     R   
 ..  ...   ...         ...   ...         ...   ...         ...   ...   
 66  ...     A         NaN     A         NaN     A         NaN     A   
 67  ...     A         NaN     A         NaN     A         NaN     A   
 68  ...     R         0.0     R         3.0     R         3.0     R   
 69  ...     A         NaN     A         NaN     A         NaN     A   
 70  ...     R         0.0     R         0.0     R         0.0     A   
 
    Unnamed: 14  R   5  
 0          NaN  R  16  
 1          NaN  R  14  
 2         35.0  R  17  
 3          NaN  R  12  
 4         36.0  R  12  
 ..         ... ..  ..  
 66         NaN  R   3  
 67         NaN  R  30  
 68       100.0  R  15  
 69         NaN  R  20  
 70         NaN  R   5  
 
 [71 rows x 33 columns],
     486497  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     9 R.1  ...  \
 0   486503  A         NaN   A         NaN   A         NaN  R  14.0   R  ...   
 1   486512  A         NaN   A         NaN   A         NaN  R  30.0   R  ...   
 2   486530  A         NaN   A         NaN   A         NaN  R  10.0   Z  ...   
 3   486558  A         NaN   A         NaN   A         NaN  R  29.0   Z  ...   
 4   486567  A         NaN   A         NaN   A         NaN  R  39.0   Z  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..   ...  ..  ...   
 66  487852  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 67  487861  R        63.0   R        86.0   R        73.0  R   0.0   R  ...   
 68  487889  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 69  487898  A         NaN   A         NaN   A         NaN  R   7.0   Z  ...   
 70  487904  A         NaN   A         NaN   A         NaN  R  12.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6  25  
 0     R        0.0    R        0.0    R        0.0    A        NaN    R  14  
 1     A        NaN    A        NaN    A        NaN    A        NaN    R  30  
 2     A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R  12  
 4     A        NaN    A        NaN    A        NaN    A        NaN    R  12  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    A        NaN    R   5  
 67    R        0.0    R        0.0    Z        0.0    A        NaN    R  22  
 68    A        NaN    A        NaN    A        NaN    A        NaN    R  10  
 69    Z        0.0    R       19.0    R        9.0    R       47.0    R   9  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  10  
 
 [71 rows x 33 columns],
     487913  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R      0 R.1  ...  \
 0   487922  A         NaN   A         NaN   A         NaN  R    5.0   R  ...   
 1   487959  A         NaN   A         NaN   A         NaN  R   27.0   R  ...   
 2   487968  A         NaN   A         NaN   A         NaN  R   64.0   R  ...   
 3   487977  A         NaN   A         NaN   A         NaN  R   61.0   R  ...   
 4   487995  A         NaN   A         NaN   A         NaN  A    NaN   A  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ...  ..  ...   
 66  489247  A         NaN   A         NaN   A         NaN  R    3.0   R  ...   
 67  489256  A         NaN   A         NaN   A         NaN  R  204.0   R  ...   
 68  489283  A         NaN   A         NaN   A         NaN  R  171.0   Z  ...   
 69  489344  A         NaN   A         NaN   A         NaN  R    2.0   R  ...   
 70  489353  R         1.0   R        35.0   R         3.0  R    0.0   R  ...   
 
     R.7  0.7  R.8    0.8  R.9    0.9  A.4 Unnamed: 4  R.10  20  
 0     A  NaN    A    NaN    A    NaN    A        NaN     R  23  
 1     R  0.0    R    0.0    R    0.0    A        NaN     R  20  
 2     A  NaN    A    NaN    A    NaN    A        NaN     R  10  
 3     R  0.0    R  151.0    R  107.0    R       71.0     R  58  
 4     A  NaN    A    NaN    A    NaN    A        NaN     R  15  
 ..  ...  ...  ...    ...  ...    ...  ...        ...   ...  ..  
 66    R  0.0    R    5.0    R    4.0    R       80.0     R   3  
 67    Z  0.0    R    4.0    R    4.0    R      100.0     R  25  
 68    A  NaN    A    NaN    A    NaN    A        NaN     R  24  
 69    R  0.0    R    0.0    R    0.0    A        NaN     R  13  
 70    R  0.0    R    1.0    R    1.0    R      100.0     R  11  
 
 [71 rows x 33 columns],
     489371  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R    75 R.1  ...  \
 0   489760  A         NaN   A         NaN   A         NaN  R  61.0   Z  ...   
 1   489779  A         NaN   A         NaN   A         NaN  R   3.0   R  ...   
 2   489812  A         NaN   A         NaN   A         NaN  R  23.0   R  ...   
 3   489830  A         NaN   A         NaN   A         NaN  R  17.0   R  ...   
 4   489937  R        65.0   R       178.0   R        37.0  R  59.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..   ...  ..  ...   
 66  490975  A         NaN   A         NaN   A         NaN  R  26.0   Z  ...   
 67  491057  R         8.0   R        69.0   R        12.0  R   3.0   R  ...   
 68  491066  A         NaN   A         NaN   A         NaN  R  10.0   R  ...   
 69  491075  A         NaN   A         NaN   A         NaN  R  24.0   R  ...   
 70  491084  A         NaN   A         NaN   A         NaN  R  38.0   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.6  13  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R   5  
 1     Z        0.0    R      439.0    R      144.0    R       33.0    R  28  
 2     A        NaN    A        NaN    A        NaN    A        NaN    R  10  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R  19  
 4     R        0.0    R        1.0    R        0.0    R        0.0    R  10  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    Z        0.0    R        2.0    R        1.0    R       50.0    R  31  
 67    A        NaN    A        NaN    A        NaN    A        NaN    R  19  
 68    R        0.0    R       11.0    R        1.0    R        9.0    R   9  
 69    A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  10  
 
 [71 rows x 33 columns],
     491136  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R    35 R.1  ...  \
 0   491145  A         NaN   A         NaN   A         NaN  R  33.0   R  ...   
 1   491181  A         NaN   A         NaN   A         NaN  R   2.0   R  ...   
 2   491215  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 3   491224  A         NaN   A         NaN   A         NaN  R   1.0   Z  ...   
 4   491233  A         NaN   A         NaN   A         NaN  R   9.0   R  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..   ...  ..  ...   
 66  492281  A         NaN   A         NaN   A         NaN  R   4.0   R  ...   
 67  492306  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 68  492315  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 69  492324  A         NaN   A         NaN   A         NaN  A   NaN   A  ...   
 70  492360  A         NaN   A         NaN   A         NaN  R   6.0   R  ...   
 
     Z.1  0.3  R.7  4.1  R.8    1  R.9     25  R.10  17  
 0     Z  0.0    R  9.0    R  6.0    R   67.0     R  16  
 1     R  0.0    R  1.0    R  0.0    R    0.0     R  16  
 2     A  NaN    A  NaN    A  NaN    A    NaN     R  11  
 3     Z  0.0    R  4.0    R  4.0    R  100.0     R  18  
 4     R  0.0    R  1.0    R  0.0    R    0.0     R  20  
 ..  ...  ...  ...  ...  ...  ...  ...    ...   ...  ..  
 66    A  NaN    A  NaN    A  NaN    A    NaN     R   2  
 67    A  NaN    A  NaN    A  NaN    A    NaN     R  15  
 68    A  NaN    A  NaN    A  NaN    A    NaN     R  11  
 69    A  NaN    A  NaN    A  NaN    A    NaN     R  18  
 70    R  0.0    R  2.0    R  2.0    R  100.0     R   8  
 
 [71 rows x 33 columns],
     492379  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2 A.3  Unnamed: 3 A.4  \
 0   492397  A         NaN   A         NaN   A         NaN   R         8.0   R   
 1   492421  A         NaN   A         NaN   A         NaN   R         9.0   Z   
 2   492449  A         NaN   A         NaN   A         NaN   R        46.0   R   
 3   492476  R         7.0   R        60.0   R        12.0   R         0.0   R   
 4   492591  A         NaN   A         NaN   A         NaN   R         2.0   R   
 ..     ... ..         ...  ..         ...  ..         ...  ..         ...  ..   
 66  494250  A         NaN   A         NaN   A         NaN   R         8.0   R   
 67  494269  A         NaN   A         NaN   A         NaN   Z         0.0   Z   
 68  494278  R        11.0   R        66.0   R        17.0   R         1.0   R   
 69  494287  R        27.0   R       101.0   R        27.0   R         0.0   R   
 70  494357  A         NaN   A         NaN   A         NaN   R       151.0   R   
 
     ...  A.11 Unnamed: 11  A.12 Unnamed: 12  A.13 Unnamed: 13  A.14  \
 0   ...     A         NaN     A         NaN     A         NaN     A   
 1   ...     A         NaN     A         NaN     A         NaN     A   
 2   ...     A         NaN     A         NaN     A         NaN     A   
 3   ...     R         0.0     R         1.0     R         0.0     R   
 4   ...     A         NaN     A         NaN     A         NaN     A   
 ..  ...   ...         ...   ...         ...   ...         ...   ...   
 66  ...     A         NaN     A         NaN     A         NaN     A   
 67  ...     Z         0.0     R         0.0     R         0.0     A   
 68  ...     R         0.0     R         0.0     R         0.0     A   
 69  ...     R         0.0     R         0.0     R         0.0     A   
 70  ...     A         NaN     A         NaN     A         NaN     A   
 
    Unnamed: 14  R  12  
 0          NaN  R  14  
 1          NaN  R  20  
 2          NaN  R  15  
 3          0.0  R  13  
 4          NaN  R   7  
 ..         ... ..  ..  
 66         NaN  R  18  
 67         NaN  R   1  
 68         NaN  R  19  
 69         NaN  R  16  
 70         NaN  R  30  
 
 [71 rows x 33 columns],
     494436  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R     30  Z  ...  \
 0   494463  A         NaN   A         NaN   A         NaN  R  146.0  R  ...   
 1   494472  A         NaN   A         NaN   A         NaN  R    0.0  R  ...   
 2   494524  A         NaN   A         NaN   A         NaN  R   25.0  R  ...   
 3   494588  A         NaN   A         NaN   A         NaN  R   37.0  R  ...   
 4   494621  A         NaN   A         NaN   A         NaN  A    NaN  A  ...   
 ..     ... ..         ...  ..         ...  ..         ... ..    ... ..  ...   
 66  495776  A         NaN   A         NaN   A         NaN  R    0.0  R  ...   
 67  495794  A         NaN   A         NaN   A         NaN  R   13.0  Z  ...   
 68  495837  A         NaN   A         NaN   A         NaN  R   59.0  R  ...   
 69  495916  A         NaN   A         NaN   A         NaN  R    1.0  Z  ...   
 70  495925  A         NaN   A         NaN   A         NaN  R   35.0  R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.4  18  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R  41  
 1     R        0.0    R        2.0    R        1.0    R       50.0    R   5  
 2     R        0.0    R       17.0    R       14.0    R       82.0    R  14  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R  10  
 4     Z        0.0    R        0.0    R        0.0    A        NaN    R   6  
 ..  ...        ...  ...        ...  ...        ...  ...        ...  ...  ..  
 66    A        NaN    A        NaN    A        NaN    A        NaN    R  12  
 67    Z        0.0    R        1.0    R        1.0    R      100.0    R  17  
 68    A        NaN    A        NaN    A        NaN    A        NaN    R  21  
 69    Z        0.0    R        0.0    Z        0.0    A        NaN    R   2  
 70    A        NaN    A        NaN    A        NaN    A        NaN    R  43  
 
 [71 rows x 33 columns],
     495934  A  Unnamed: 0 A.1  Unnamed: 1 A.2  Unnamed: 2  R  25 R.1  ...  \
 0   495943  A         NaN   A         NaN   A         NaN  R  14   R  ...   
 1   495952  A         NaN   A         NaN   A         NaN  R   0   R  ...   
 2   495961  A         NaN   A         NaN   A         NaN  R   0   Z  ...   
 3   495970  A         NaN   A         NaN   A         NaN  R  23   R  ...   
 4   495998  A         NaN   A         NaN   A         NaN  R  30   R  ...   
 5   496052  A         NaN   A         NaN   A         NaN  R   3   R  ...   
 6   496061  A         NaN   A         NaN   A         NaN  R  23   R  ...   
 7   496186  A         NaN   A         NaN   A         NaN  R   0   R  ...   
 8   496265  R        11.0   R        13.0   R        85.0  R   1   R  ...   
 9   496283  A         NaN   A         NaN   A         NaN  R   0   Z  ...   
 10  496326  A         NaN   A         NaN   A         NaN  R   0   Z  ...   
 11  496371  A         NaN   A         NaN   A         NaN  R   0   Z  ...   
 12  496423  A         NaN   A         NaN   A         NaN  R   6   R  ...   
 
     A.5 Unnamed: 5  A.6 Unnamed: 6  A.7 Unnamed: 7  A.8 Unnamed: 8  R.5  43  
 0     A        NaN    A        NaN    A        NaN    A        NaN    R  63  
 1     A        NaN    A        NaN    A        NaN    A        NaN    R  17  
 2     A        NaN    A        NaN    A        NaN    A        NaN    R  28  
 3     A        NaN    A        NaN    A        NaN    A        NaN    R  42  
 4     A        NaN    A        NaN    A        NaN    A        NaN    R  61  
 5     R        0.0    R        0.0    R        0.0    A        NaN    R   3  
 6     R        0.0    R        4.0    R        0.0    R        0.0    R  25  
 7     A        NaN    A        NaN    A        NaN    A        NaN    R   8  
 8     A        NaN    A        NaN    A        NaN    A        NaN    R   7  
 9     Z        0.0    R        0.0    Z        0.0    A        NaN    R   8  
 10    Z        0.0    R        0.0    Z        0.0    A        NaN    R   7  
 11    A        NaN    A        NaN    A        NaN    A        NaN    R  16  
 12    R        0.0    R        0.0    R        0.0    A        NaN    R   6  
 
 [13 rows x 33 columns]]
In [67]:
en = en[0]
In [68]:
en.head()
Out[68]:
UNITID XGRCOHR GRCOHRT XUGENTER UGENTER XPGRCOH PGRCOHR XRRFTCT RRFTCT XRRFTEX ... XRRPTIN RRPTIN XRRPTCTA RRPTCTA XRET_NM.1 RET_NMP XRET_PCP RET_PCP XSTUFACR STUFACR
0 100654 R 1525.0 R 1712.0 R 89.0 R 1688.0 R ... R 0.0 R 6.0 R 2.0 R 33.0 R 18
1 100663 R 2102.0 R 3529.0 R 60.0 R 2294.0 R ... R 0.0 R 42.0 R 20.0 R 48.0 R 20
2 100690 A NaN A NaN A NaN R 2.0 Z ... A NaN A NaN A NaN A NaN R 13
3 100706 R 1328.0 R 2186.0 R 61.0 R 1489.0 R ... R 0.0 R 8.0 R 2.0 R 25.0 R 19
4 100724 R 926.0 R 1123.0 R 82.0 R 1000.0 R ... R 0.0 R 19.0 R 1.0 R 5.0 R 15

5 rows × 33 columns

In [69]:
en.tail()
Out[69]:
UNITID XGRCOHR GRCOHRT XUGENTER UGENTER XPGRCOH PGRCOHR XRRFTCT RRFTCT XRRFTEX ... XRRPTIN RRPTIN XRRPTCTA RRPTCTA XRET_NM.1 RET_NMP XRET_PCP RET_PCP XSTUFACR STUFACR
66 103927 A NaN A NaN A NaN R 41.0 Z ... A NaN A NaN A NaN A NaN R 25
67 103945 A NaN A NaN A NaN R 2.0 Z ... A NaN A NaN A NaN A NaN R 4
68 103954 A NaN A NaN A NaN R 3.0 R ... A NaN A NaN A NaN A NaN R 20
69 103963 A NaN A NaN A NaN R 99.0 R ... R 0.0 R 0.0 R 0.0 A NaN R 10
70 104090 A NaN A NaN A NaN R 19.0 R ... R 0.0 R 0.0 R 0.0 A NaN R 24

5 rows × 33 columns

Remove Columns that Contain Codes¶

In [70]:
en.columns
# Let's see if we can remove the columns that contain an 'X'
Out[70]:
Index(['UNITID', 'XGRCOHR', 'GRCOHRT', 'XUGENTER', 'UGENTER', 'XPGRCOH',
       'PGRCOHR', 'XRRFTCT', 'RRFTCT', 'XRRFTEX', 'RRFTEX', 'XRRFTIN',
       'RRFTIN', 'XRRFTCTA', 'RRFTCTA', 'XRET_NM', 'RET_NMF', 'XRET_PCF',
       'RET_PCF', 'XRRPTCT', 'RRPTCT', 'XRRPTEX', 'RRPTEX', 'XRRPTIN',
       'RRPTIN', 'XRRPTCTA', 'RRPTCTA', 'XRET_NM.1', 'RET_NMP', 'XRET_PCP',
       'RET_PCP', 'XSTUFACR', 'STUFACR'],
      dtype='object')
In [71]:
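cols_to_keep = []

# Keep only the columns without an 'X' in their name.
# Note: this test also drops RRFTEX and RRPTEX, which merely contain an 'X';
# c.startswith('X') would drop only the X-prefixed code columns.
for c in en.columns:
    if 'X' not in c: 
        cols_to_keep.append(c)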
In [73]:
cols_to_keep
Out[73]:
['UNITID',
 'GRCOHRT',
 'UGENTER',
 'PGRCOHR',
 'RRFTCT',
 'RRFTIN',
 'RRFTCTA',
 'RET_NMF',
 'RET_PCF',
 'RRPTCT',
 'RRPTIN',
 'RRPTCTA',
 'RET_NMP',
 'RET_PCP',
 'STUFACR']
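The `'X' not in c` test matches an X anywhere in the name, which is why RRFTEX and RRPTEX are missing from the list above. A minimal sketch of a prefix-only filter, assuming the code columns are exactly the ones whose names start with 'X' (the tiny DataFrame is made up for illustration):

```python
import pandas as pd

# Hypothetical miniature of the IPEDS frame: value columns plus X-prefixed code columns.
demo = pd.DataFrame({
    'UNITID': [100654, 100663],
    'XRRFTEX': ['R', 'R'],
    'RRFTEX': [2.0, 0.0],
    'XSTUFACR': ['R', 'R'],
    'STUFACR': [18, 20],
})

# Keep every column whose name does not START with 'X'.
cols_to_keep = [c for c in demo.columns if not c.startswith('X')]
print(cols_to_keep)
```

With this test RRFTEX survives, because its 'X' is at the end of the name rather than the start.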
In [74]:
# .copy() makes en2 an independent DataFrame, so later edits don't touch en
en2 = en[cols_to_keep].copy()
In [75]:
en2.iloc[:, 1:].describe()
Out[75]:
GRCOHRT UGENTER PGRCOHR RRFTCT RRFTIN RRFTCTA RET_NMF RET_PCF RRPTCT RRPTIN RRPTCTA RET_NMP RET_PCP STUFACR
count 56.000000 56.000000 56.000000 69.000000 69.0 69.000000 69.000000 68.000000 57.000000 57.0 57.000000 57.000000 50.000000 71.000000
mean 688.464286 1279.821429 50.375000 631.014493 0.0 630.811594 460.000000 67.588235 91.578947 0.0 91.526316 40.649123 42.140000 15.014085
std 1085.246230 1481.970201 24.002699 1032.230758 0.0 1032.015992 900.678502 13.942564 151.390336 0.0 151.322162 81.625340 24.603808 5.362550
min 1.000000 12.000000 6.000000 0.000000 0.0 0.000000 0.000000 34.000000 0.000000 0.0 0.000000 0.000000 0.000000 4.000000
25% 188.000000 341.500000 33.750000 66.000000 0.0 66.000000 41.000000 58.500000 4.000000 0.0 4.000000 2.000000 29.750000 12.000000
50% 354.000000 789.000000 43.000000 320.000000 0.0 320.000000 208.000000 65.000000 24.000000 0.0 24.000000 9.000000 38.000000 15.000000
75% 898.250000 1544.750000 69.750000 789.000000 0.0 789.000000 435.000000 77.250000 133.000000 0.0 133.000000 42.000000 50.750000 18.000000
max 6466.000000 7948.000000 97.000000 6734.000000 0.0 6733.000000 5873.000000 100.000000 761.000000 0.0 760.000000 462.000000 100.000000 35.000000
In [76]:
en2.iloc[:,1:].corr()
Out[76]:
GRCOHRT UGENTER PGRCOHR RRFTCT RRFTIN RRFTCTA RET_NMF RET_PCF RRPTCT RRPTIN RRPTCTA RET_NMP RET_PCP STUFACR
GRCOHRT 1.000000 0.928805 0.319417 0.994914 NaN 0.994905 0.994323 0.379099 0.043763 NaN 0.042785 0.080625 0.162304 0.484789
UGENTER 0.928805 1.000000 0.125399 0.932618 NaN 0.932670 0.909415 0.293813 0.343800 NaN 0.343019 0.376257 0.147883 0.599959
PGRCOHR 0.319417 0.125399 1.000000 0.301361 NaN 0.301289 0.307085 0.139696 -0.306241 NaN -0.306675 -0.279614 -0.129778 0.098439
RRFTCT 0.994914 0.932618 0.301361 1.000000 NaN 1.000000 0.991601 0.234453 0.064328 NaN 0.063448 0.032688 0.103813 0.253491
RRFTIN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
RRFTCTA 0.994905 0.932670 0.301289 1.000000 NaN 1.000000 0.991605 0.234525 0.064436 NaN 0.063556 0.032769 0.103779 0.253546
RET_NMF 0.994323 0.909415 0.307085 0.991601 NaN 0.991605 1.000000 0.298123 0.006206 NaN 0.005252 -0.000583 0.130204 0.237952
RET_PCF 0.379099 0.293813 0.139696 0.234453 NaN 0.234525 0.298123 1.000000 -0.211501 NaN -0.212017 -0.050746 0.100858 0.027222
RRPTCT 0.043763 0.343800 -0.306241 0.064328 NaN 0.064436 0.006206 -0.211501 1.000000 NaN 0.999998 0.896768 0.065188 0.276120
RRPTIN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
RRPTCTA 0.042785 0.343019 -0.306675 0.063448 NaN 0.063556 0.005252 -0.212017 0.999998 NaN 1.000000 0.896813 0.064872 0.275766
RET_NMP 0.080625 0.376257 -0.279614 0.032688 NaN 0.032769 -0.000583 -0.050746 0.896768 NaN 0.896813 1.000000 0.249742 0.178385
RET_PCP 0.162304 0.147883 -0.129778 0.103813 NaN 0.103779 0.130204 0.100858 0.065188 NaN 0.064872 0.249742 1.000000 0.060263
STUFACR 0.484789 0.599959 0.098439 0.253491 NaN 0.253546 0.237952 0.027222 0.276120 NaN 0.275766 0.178385 0.060263 1.000000
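The all-NaN rows for RRFTIN and RRPTIN follow from the summary statistics: both columns have a standard deviation of 0, and a correlation with a constant column is undefined. A small sketch, on made-up numbers, of dropping zero-variance columns before calling `corr()`:

```python
import pandas as pd

# Made-up data: 'constant' never varies, so its correlation with anything is undefined.
demo = pd.DataFrame({
    'a': [1.0, 2.0, 3.0, 4.0],
    'b': [2.0, 4.0, 6.0, 9.0],
    'constant': [0.0, 0.0, 0.0, 0.0],
})

# Keep only the columns whose standard deviation is greater than zero.
varying = demo.loc[:, demo.std() > 0]
corr = varying.corr()
print(corr.round(3))
```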
In [77]:
# Mask the upper triangle so each pair of columns is shown only once
corr = en2.iloc[:, 1:].corr()
mask = np.triu(np.ones_like(corr, dtype=bool))
# Time to plot! 
with sns.axes_style("white"):
    f, ax = plt.subplots(figsize = (15,9))
    sns.heatmap(corr, mask=mask, cmap = 'RdBu')

Changing a Data Type¶

In [79]:
en2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 71 entries, 0 to 70
Data columns (total 15 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   UNITID   71 non-null     int64  
 1   GRCOHRT  56 non-null     float64
 2   UGENTER  56 non-null     float64
 3   PGRCOHR  56 non-null     float64
 4   RRFTCT   69 non-null     float64
 5   RRFTIN   69 non-null     float64
 6   RRFTCTA  69 non-null     float64
 7   RET_NMF  69 non-null     float64
 8   RET_PCF  68 non-null     float64
 9   RRPTCT   57 non-null     float64
 10  RRPTIN   57 non-null     float64
 11  RRPTCTA  57 non-null     float64
 12  RET_NMP  57 non-null     float64
 13  RET_PCP  50 non-null     float64
 14  STUFACR  71 non-null     int64  
dtypes: float64(13), int64(2)
memory usage: 8.4 KB
In [80]:
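# Caution: .loc['STUFACR'] addresses a row label, not the column. This line appends
# an all-NaN row named 'STUFACR' (note the 72 entries below) and turns the integer
# columns into floats as a side effect. en2['STUFACR'] = ... would convert only the column.
en2.loc['STUFACR'] = en2.STUFACR.astype('float')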
In [81]:
en2.info()
<class 'pandas.core.frame.DataFrame'>
Index: 72 entries, 0 to STUFACR
Data columns (total 15 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   UNITID   71 non-null     float64
 1   GRCOHRT  56 non-null     float64
 2   UGENTER  56 non-null     float64
 3   PGRCOHR  56 non-null     float64
 4   RRFTCT   69 non-null     float64
 5   RRFTIN   69 non-null     float64
 6   RRFTCTA  69 non-null     float64
 7   RET_NMF  69 non-null     float64
 8   RET_PCF  68 non-null     float64
 9   RRPTCT   57 non-null     float64
 10  RRPTIN   57 non-null     float64
 11  RRPTCTA  57 non-null     float64
 12  RET_NMP  57 non-null     float64
 13  RET_PCP  50 non-null     float64
 14  STUFACR  71 non-null     float64
dtypes: float64(15)
memory usage: 9.0+ KB
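In the output above the frame has grown to 72 entries and the index now ends at `STUFACR`: `.loc['STUFACR']` addresses a row label, so the assignment appended a row instead of converting the column. A minimal sketch of converting a single column, using a made-up two-column frame:

```python
import pandas as pd

# Made-up frame with integer columns, standing in for en2.
demo = pd.DataFrame({'UNITID': [100654, 100663, 100690],
                     'STUFACR': [18, 20, 13]})

# Square brackets with a column name select the column, so only its dtype changes.
demo['STUFACR'] = demo['STUFACR'].astype('float')

print(demo.dtypes)
print(len(demo))
```

The row count stays the same and UNITID keeps its integer type.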

Checking for NAs¶

In [82]:
for n in en2:
  print("The percent of missing values in {0} is {1}".format(n, en2[[n]].isna().mean().round(2)))
The percent of missing values in UNITID is UNITID    0.01
dtype: float64
The percent of missing values in GRCOHRT is GRCOHRT    0.22
dtype: float64
The percent of missing values in UGENTER is UGENTER    0.22
dtype: float64
The percent of missing values in PGRCOHR is PGRCOHR    0.22
dtype: float64
The percent of missing values in RRFTCT is RRFTCT    0.04
dtype: float64
The percent of missing values in RRFTIN is RRFTIN    0.04
dtype: float64
The percent of missing values in RRFTCTA is RRFTCTA    0.04
dtype: float64
The percent of missing values in RET_NMF is RET_NMF    0.04
dtype: float64
The percent of missing values in RET_PCF is RET_PCF    0.06
dtype: float64
The percent of missing values in RRPTCT is RRPTCT    0.21
dtype: float64
The percent of missing values in RRPTIN is RRPTIN    0.21
dtype: float64
The percent of missing values in RRPTCTA is RRPTCTA    0.21
dtype: float64
The percent of missing values in RET_NMP is RET_NMP    0.21
dtype: float64
The percent of missing values in RET_PCP is RET_PCP    0.31
dtype: float64
The percent of missing values in STUFACR is STUFACR    0.01
dtype: float64
In [83]:
for n in en2:
  print("The percent of missing values in {0} is {1}".format(n, en2[[n]].isna().mean()))
The percent of missing values in UNITID is UNITID    0.013889
dtype: float64
The percent of missing values in GRCOHRT is GRCOHRT    0.222222
dtype: float64
The percent of missing values in UGENTER is UGENTER    0.222222
dtype: float64
The percent of missing values in PGRCOHR is PGRCOHR    0.222222
dtype: float64
The percent of missing values in RRFTCT is RRFTCT    0.041667
dtype: float64
The percent of missing values in RRFTIN is RRFTIN    0.041667
dtype: float64
The percent of missing values in RRFTCTA is RRFTCTA    0.041667
dtype: float64
The percent of missing values in RET_NMF is RET_NMF    0.041667
dtype: float64
The percent of missing values in RET_PCF is RET_PCF    0.055556
dtype: float64
The percent of missing values in RRPTCT is RRPTCT    0.208333
dtype: float64
The percent of missing values in RRPTIN is RRPTIN    0.208333
dtype: float64
The percent of missing values in RRPTCTA is RRPTCTA    0.208333
dtype: float64
The percent of missing values in RET_NMP is RET_NMP    0.208333
dtype: float64
The percent of missing values in RET_PCP is RET_PCP    0.305556
dtype: float64
The percent of missing values in STUFACR is STUFACR    0.013889
dtype: float64
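The loops above print a one-element Series each time, which is where the extra `dtype: float64` lines come from. A sketch of the same check on made-up data: calling `isna().mean()` on the whole frame returns one share of missing values per column, which can then be printed as plain numbers.

```python
import numpy as np
import pandas as pd

# Made-up data with a few missing values.
demo = pd.DataFrame({
    'RET_PCF': [54.0, np.nan, 50.0, 82.0],
    'RET_PCP': [33.0, np.nan, np.nan, 25.0],
    'STUFACR': [18, 20, 13, 19],
})

# isna() marks missing cells as True; the mean of True/False values is the share missing.
missing_share = demo.isna().mean().round(2)

for name, share in missing_share.items():
    print("The share of missing values in {0} is {1}".format(name, share))
```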

Filling Missing Values with the Mean¶

In [84]:
ret_mean = en2.RET_PCF.mean().round(2)
ret_na = en2.RET_PCF.isna().mean().round(2)

print('The mean for retention is {0} and the missing values is {1}'.format(ret_mean, ret_na))
The mean for retention is 67.59 and the missing values is 0.06
In [85]:
en2['RET_PCF'] = en2['RET_PCF'].fillna(en2['RET_PCF'].mean())
In [86]:
ret_mean = en2.RET_PCF.mean().round(2)
ret_na = en2.RET_PCF.isna().mean().round(2)

print('The mean for retention is {0} and the missing values is {1}'.format(ret_mean, ret_na))
The mean for retention is 67.59 and the missing values is 0.0
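The mean is 67.59 both before and after the fill, which is expected: replacing missing values with the mean cannot move the mean. A small sketch on made-up values, written with plain assignment rather than `inplace=True`:

```python
import numpy as np
import pandas as pd

# Made-up retention rates with one missing value.
rates = pd.Series([54.0, 86.0, np.nan, 82.0, 62.0], name='RET_PCF')

mean_before = rates.mean()            # missing values are skipped by default
filled = rates.fillna(mean_before)    # returns a new Series; `rates` is untouched

print(mean_before, filled.mean(), filled.isna().sum())
```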

Create a Feature¶

In [87]:
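# Flag institutions whose full-time retention rate is above the mean.
# (en.RET_PCF is the unfilled column; mean-filling en2 leaves the mean essentially unchanged.)
en2['Higher_Than_Avg_Ret_Rate'] = en2.RET_PCF > en.RET_PCF.mean()
# Standardized retention: distance from the mean in standard deviations (a z-score).
en2['Ret_Index'] = (en2.RET_PCF-en.RET_PCF.mean())/en2.RET_PCF.std()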
In [88]:
en2.head()
Out[88]:
UNITID GRCOHRT UGENTER PGRCOHR RRFTCT RRFTIN RRFTCTA RET_NMF RET_PCF RRPTCT RRPTIN RRPTCTA RET_NMP RET_PCP STUFACR Higher_Than_Avg_Ret_Rate Ret_Index
0 100654.0 1525.0 1712.0 89.0 1688.0 0.0 1686.0 911.0 54.0 6.0 0.0 6.0 2.0 33.0 18.0 False -1.003257
1 100663.0 2102.0 3529.0 60.0 2294.0 0.0 2294.0 1982.0 86.0 42.0 0.0 42.0 20.0 48.0 20.0 True 1.359392
2 100690.0 NaN NaN NaN 2.0 0.0 2.0 1.0 50.0 NaN NaN NaN NaN NaN 13.0 False -1.298588
3 100706.0 1328.0 2186.0 61.0 1489.0 0.0 1489.0 1218.0 82.0 8.0 0.0 8.0 2.0 25.0 19.0 True 1.064060
4 100724.0 926.0 1123.0 82.0 1000.0 0.0 998.0 619.0 62.0 19.0 0.0 19.0 1.0 5.0 15.0 False -0.412595
In [89]:
en2[['Higher_Than_Avg_Ret_Rate', 'Ret_Index']][60:70]
Out[89]:
Higher_Than_Avg_Ret_Rate Ret_Index
60 True 0.990228
61 False 0.000000
62 True 0.251900
63 False -0.043431
64 True 0.325733
65 False -0.634093
66 True 0.768729
67 True 2.393050
68 True 2.393050
69 False -1.372421
In [90]:
en2.Higher_Than_Avg_Ret_Rate.mean()
Out[90]:
0.4166666666666667
In [91]:
en2.Ret_Index.mean()
Out[91]:
-3.762422472340808e-16
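A mean of about -3.8e-16 is zero up to floating-point rounding, which is what a standardized score should give. A sketch of the same calculation on made-up values:

```python
import pandas as pd

# Made-up retention rates.
rates = pd.Series([54.0, 86.0, 50.0, 82.0, 62.0])

# Subtract the mean and divide by the (sample) standard deviation.
index = (rates - rates.mean()) / rates.std()

print(index.round(3).tolist())
print(abs(index.mean()) < 1e-12, round(index.std(), 6))
```

By construction the result has a mean of zero and a standard deviation of one, so values can be read as "standard deviations above or below average".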
In [92]:
grouped_en = en2[['Ret_Index', 'Higher_Than_Avg_Ret_Rate']].groupby('Higher_Than_Avg_Ret_Rate')
grouped_en.describe()
Out[92]:
Ret_Index
count mean std min 25% 50% 75% max
Higher_Than_Avg_Ret_Rate
False 42.0 -0.668631 0.512287 -2.479912 -0.929424 -0.634093 -0.338762 0.00000
True 30.0 0.936084 0.723393 0.030402 0.418024 0.768729 1.285559 2.39305
In [93]:
grouped_en.mean().plot(kind = 'bar')
plt.show()

Filtering¶

In [94]:
code = pd.read_excel('IPEDS201920Tablesdoc.xlsx', sheet_name= 'valueSets19')
In [95]:
code.head()
Out[95]:
SurveyOrder Tablenumber TableName varNumber varOrder varName Codevalue Frequency Percent valueOrder valueLabel varTitle
0 10 10 HD2019 10016 7 STABBR AL 83.0 1.27 1 Alabama State abbreviation
1 10 10 HD2019 10016 7 STABBR AK 10.0 0.15 2 Alaska State abbreviation
2 10 10 HD2019 10016 7 STABBR AZ 117.0 1.78 3 Arizona State abbreviation
3 10 10 HD2019 10016 7 STABBR AR 88.0 1.34 4 Arkansas State abbreviation
4 10 10 HD2019 10016 7 STABBR CA 699.0 10.66 5 California State abbreviation
In [96]:
hd2019var = code[code['TableName'] == 'HD2019']
In [97]:
hd2019var[hd2019var['varName'] == 'SECTOR']
Out[97]:
SurveyOrder Tablenumber TableName varNumber varOrder varName Codevalue Frequency Percent valueOrder valueLabel varTitle
136 10 10 HD2019 10086 32 SECTOR 0 73.0 1.11 1 Administrative Unit Sector of institution
137 10 10 HD2019 10086 32 SECTOR 1 807.0 12.30 2 Public, 4-year or above Sector of institution
138 10 10 HD2019 10086 32 SECTOR 2 1685.0 25.69 3 Private not-for-profit, 4-year or above Sector of institution
139 10 10 HD2019 10086 32 SECTOR 3 390.0 5.95 4 Private for-profit, 4-year or above Sector of institution
140 10 10 HD2019 10086 32 SECTOR 4 948.0 14.45 5 Public, 2-year Sector of institution
141 10 10 HD2019 10086 32 SECTOR 5 153.0 2.33 6 Private not-for-profit, 2-year Sector of institution
142 10 10 HD2019 10086 32 SECTOR 6 620.0 9.45 7 Private for-profit, 2-year Sector of institution
143 10 10 HD2019 10086 32 SECTOR 7 235.0 3.58 8 Public, less-than 2-year Sector of institution
144 10 10 HD2019 10086 32 SECTOR 8 64.0 0.98 9 Private not-for-profit, less-than 2-year Sector of institution
145 10 10 HD2019 10086 32 SECTOR 9 1552.0 23.66 10 Private for-profit, less-than 2-year Sector of institution
146 10 10 HD2019 10086 32 SECTOR 99 32.0 0.49 11 Sector unknown (not active) Sector of institution
In [98]:
hd2019var.loc[hd2019var['varName'] == 'SECTOR', ['Codevalue', 'valueLabel']]
Out[98]:
Codevalue valueLabel
136 0 Administrative Unit
137 1 Public, 4-year or above
138 2 Private not-for-profit, 4-year or above
139 3 Private for-profit, 4-year or above
140 4 Public, 2-year
141 5 Private not-for-profit, 2-year
142 6 Private for-profit, 2-year
143 7 Public, less-than 2-year
144 8 Private not-for-profit, less-than 2-year
145 9 Private for-profit, less-than 2-year
146 99 Sector unknown (not active)
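Picking columns and then filtering rows in two steps can also be written as a single `.loc` call that takes the row condition and the column list together. A sketch on a made-up slice of the value-set table:

```python
import pandas as pd

# Made-up rows shaped like the IPEDS value-set sheet.
codes = pd.DataFrame({
    'TableName': ['HD2019', 'HD2019', 'HD2019'],
    'varName': ['STABBR', 'SECTOR', 'SECTOR'],
    'Codevalue': ['AL', '1', '2'],
    'valueLabel': ['Alabama', 'Public, 4-year or above',
                   'Private not-for-profit, 4-year or above'],
})

# Rows where varName is SECTOR, and only the two columns of interest.
is_sector = codes['varName'] == 'SECTOR'
sector_codes = codes.loc[is_sector, ['Codevalue', 'valueLabel']]
print(sector_codes)
```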

Joining Dataframes¶

In [99]:
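# Build a small lookup table: one row per sector code and its label
sector = hd2019var.loc[hd2019var['varName'] == 'SECTOR', ['Codevalue', 'valueLabel']].copy()
sector.columns = ['SECTOR', 'SECTOR LABEL']
# The codes are converted to integers so they match hd's SECTOR column
sector['SECTOR'] = sector['SECTOR'].astype(int)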
In [100]:
hd2 = pd.merge(hd, sector, on = 'SECTOR', how = 'left')
In [101]:
hd2.head()
Out[101]:
UNITID INSTNM IALIAS CITY STABBR ZIP CHFNM CHFTITLE OPEID SECTOR HBCU HLOFFER UGOFFER SECTOR LABEL
0 100654 Alabama A & M University AAMU Normal AL 35762 Dr. Andrew Hugine, Jr. President 100200 1 1 9 1 Public, 4-year or above
1 100663 University of Alabama at Birmingham Birmingham AL 35294-0110 Ray L. Watts President 105200 1 2 9 1 Public, 4-year or above
2 100690 Amridge University Southern Christian University Regions University Montgomery AL 36117-3553 Michael C.Turner President 2503400 2 2 9 1 Private not-for-profit, 4-year or above
3 100706 University of Alabama in Huntsville UAH University of Alabama Huntsville Huntsville AL 35899 Darren Dawson President 105500 1 2 9 1 Public, 4-year or above
4 100724 Alabama State University Montgomery AL 36104-0271 Quinton T. Ross President 100500 1 1 9 1 Public, 4-year or above
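A left join keeps every institution even when its sector code has no match in the lookup table, in which case the label comes back as NaN. One way to check for that, sketched here on made-up rows, is `merge`'s `indicator=True` option, which adds a `_merge` column saying where each row was found:

```python
import pandas as pd

# Made-up institutions; code 42 has no entry in the lookup table.
institutions = pd.DataFrame({'UNITID': [1, 2, 3], 'SECTOR': [1, 2, 42]})
sector = pd.DataFrame({'SECTOR': [1, 2],
                       'SECTOR LABEL': ['Public, 4-year or above',
                                        'Private not-for-profit, 4-year or above']})

merged = pd.merge(institutions, sector, on='SECTOR', how='left', indicator=True)

# Rows marked 'left_only' had no matching code in the lookup table.
unmatched = merged[merged['_merge'] == 'left_only']
print(unmatched[['UNITID', 'SECTOR']])
```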
In [102]:
figsize = (12, 6)

plt.figure(figsize = figsize)
sns.countplot(x = 'SECTOR LABEL', data = hd2)
plt.xlabel("Sector of Institution")
plt.ylabel("Number of Institutions")
plt.title("Count of Institutions by Sector")
plt.xticks(rotation = 90)

plt.show()
In [104]:
figsize = (12, 6)

plt.figure(figsize = figsize)
sns.countplot(x = 'SECTOR LABEL', data = hd2, hue = 'UGOFFER', palette='Set2')
plt.xlabel("Sector of Institution")
plt.ylabel("Number of Institutions")
plt.title("Count of Institutions by Sector")
plt.xticks(rotation = 90)

plt.show()

Pivot Tables¶

In [106]:
ids = hd2[['UNITID', 'INSTNM', 'SECTOR LABEL']]
ids.head()
Out[106]:
UNITID INSTNM SECTOR LABEL
0 100654 Alabama A & M University Public, 4-year or above
1 100663 University of Alabama at Birmingham Public, 4-year or above
2 100690 Amridge University Private not-for-profit, 4-year or above
3 100706 University of Alabama in Huntsville Public, 4-year or above
4 100724 Alabama State University Public, 4-year or above
In [107]:
print(en2.head())
en2 = pd.merge(en2, ids, on = 'UNITID', how = 'left')
     UNITID  GRCOHRT  UGENTER  PGRCOHR  RRFTCT  RRFTIN  RRFTCTA  RET_NMF  \
0  100654.0   1525.0   1712.0     89.0  1688.0     0.0   1686.0    911.0   
1  100663.0   2102.0   3529.0     60.0  2294.0     0.0   2294.0   1982.0   
2  100690.0      NaN      NaN      NaN     2.0     0.0      2.0      1.0   
3  100706.0   1328.0   2186.0     61.0  1489.0     0.0   1489.0   1218.0   
4  100724.0    926.0   1123.0     82.0  1000.0     0.0    998.0    619.0   

   RET_PCF  RRPTCT  RRPTIN  RRPTCTA  RET_NMP  RET_PCP  STUFACR  \
0     54.0     6.0     0.0      6.0      2.0     33.0     18.0   
1     86.0    42.0     0.0     42.0     20.0     48.0     20.0   
2     50.0     NaN     NaN      NaN      NaN      NaN     13.0   
3     82.0     8.0     0.0      8.0      2.0     25.0     19.0   
4     62.0    19.0     0.0     19.0      1.0      5.0     15.0   

   Higher_Than_Avg_Ret_Rate  Ret_Index  
0                     False  -1.003257  
1                      True   1.359392  
2                     False  -1.298588  
3                      True   1.064060  
4                     False  -0.412595  
In [108]:
pd.pivot_table(en2, 
               index = 'SECTOR LABEL', 
               values = ['RET_PCF', 'Ret_Index'],
               aggfunc = 'mean').round(3)
Out[108]:
RET_PCF Ret_Index
SECTOR LABEL
Private for-profit, 2-year 70.000 0.178
Private for-profit, 4-year or above 65.196 -0.177
Private for-profit, less-than 2-year 73.500 0.436
Private not-for-profit, 2-year 100.000 2.393
Private not-for-profit, 4-year or above 71.477 0.287
Public, 2-year 56.870 -0.791
Public, 4-year or above 73.740 0.454
Public, less-than 2-year 81.000 0.990
In [110]:
piv_df = pd.pivot_table(en2, 
               index = 'SECTOR LABEL', 
               values = ['RET_PCF', 'Ret_Index'],
               aggfunc = 'mean').round(3)
In [111]:
piv_df
Out[111]:
RET_PCF Ret_Index
SECTOR LABEL
Private for-profit, 2-year 70.000 0.178
Private for-profit, 4-year or above 65.196 -0.177
Private for-profit, less-than 2-year 73.500 0.436
Private not-for-profit, 2-year 100.000 2.393
Private not-for-profit, 4-year or above 71.477 0.287
Public, 2-year 56.870 -0.791
Public, 4-year or above 73.740 0.454
Public, less-than 2-year 81.000 0.990
In [112]:
piv_df.reset_index()
Out[112]:
SECTOR LABEL RET_PCF Ret_Index
0 Private for-profit, 2-year 70.000 0.178
1 Private for-profit, 4-year or above 65.196 -0.177
2 Private for-profit, less-than 2-year 73.500 0.436
3 Private not-for-profit, 2-year 100.000 2.393
4 Private not-for-profit, 4-year or above 71.477 0.287
5 Public, 2-year 56.870 -0.791
6 Public, 4-year or above 73.740 0.454
7 Public, less-than 2-year 81.000 0.990
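For a single grouping column and a mean, a pivot table and a `groupby` give the same numbers. A sketch on made-up rows, in case the `groupby` spelling is more familiar:

```python
import pandas as pd

# Made-up retention rates for two sectors.
demo = pd.DataFrame({
    'SECTOR LABEL': ['Public, 2-year', 'Public, 2-year', 'Public, 4-year or above'],
    'RET_PCF': [50.0, 60.0, 80.0],
})

by_pivot = pd.pivot_table(demo, index='SECTOR LABEL', values='RET_PCF', aggfunc='mean')
by_group = demo.groupby('SECTOR LABEL')[['RET_PCF']].mean()

print(by_group)
```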

Saving to an Excel File¶

In [113]:
writer = pd.ExcelWriter('Retention Report.xlsx', engine = 'xlsxwriter')
piv_df.reset_index().to_excel(writer, sheet_name = 'Pivot_Table', index=False)
In [114]:
workbook = writer.book
worksheet = writer.sheets['Pivot_Table']
In [115]:
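(max_row, max_col) = piv_df.shape

# Column chart of the values in column B (RET_PCF), data rows 1..max_row
chart = workbook.add_chart({'type': 'column'})

chart.add_series({'values': ['Pivot_Table', 1, 1, max_row, 1]})
worksheet.insert_chart(1,3, chart)

# Cell formats: left-aligned text, percentages, and one decimal place
format1 = workbook.add_format()
format1.set_align('left')
format2 = workbook.add_format({'num_format': '0.0%'})
format3 = workbook.add_format({'num_format': '0.0'})

# Color-scale the Ret_Index values (column C, one row per sector)
worksheet.conditional_format('C2:C9', {'type': '3_color_scale'})

# set_column(first_col, last_col, width, format)
worksheet.set_column(0, 0, 35, format1)
worksheet.set_column(1, 1, 18, format3)
worksheet.set_column(2, 2, 18, format2)

# Save and close the workbook (writer.save() was removed in pandas 2.0)
writer.close()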
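The conditional format above is hard-coded to `C2:C9`, which matches the eight sectors in this pivot table but would need editing if the number of rows changed. A sketch of deriving the range from the DataFrame instead; `data_range` and `save_report` are hypothetical helpers, not part of pandas or XlsxWriter, and `save_report` assumes the `xlsxwriter` package is installed:

```python
import pandas as pd


def data_range(df, column_letter):
    """Excel range covering the data rows of one column (row 1 is the header)."""
    return '{0}2:{0}{1}'.format(column_letter, len(df) + 1)


def save_report(df, path):
    """Write df to one sheet and color-scale column C over its data rows."""
    # The with-block saves and closes the workbook on exit.
    with pd.ExcelWriter(path, engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Pivot_Table', index=False)
        worksheet = writer.sheets['Pivot_Table']
        worksheet.conditional_format(data_range(df, 'C'), {'type': '3_color_scale'})


# Made-up three-row table standing in for the pivot table.
demo = pd.DataFrame({'SECTOR LABEL': ['a', 'b', 'c'],
                     'RET_PCF': [70.0, 65.2, 73.5],
                     'Ret_Index': [0.178, -0.177, 0.436]})
print(data_range(demo, 'C'))
```

Calling `save_report(demo, 'Retention Report.xlsx')` would then write the workbook with the range sized to the data.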

Putting it all together¶

In [116]:
def get_the_data():
    
    df = pd.read_html('https://nces.ed.gov/programs/digest/d21/tables/dt21_302.10.asp?current=yes')
    df2 = df[3]
    
    return df2
In [117]:
def clean_data(df2):
    
    # Keep the columns whose fifth header level has no '.1' suffix
    cols_to_keep = []
    for c in df2.columns:
        if '.1' not in c[4]:
            cols_to_keep.append(c)
            
    df3 = df2[cols_to_keep].copy()
    
    # Flatten the multi-level header into single 'level 3 - level 2' names
    df3.columns = df3.columns.map(lambda x: x[3] + ' - ' + x[2])
    df4 = df3[df3['Year - Year'] >= 2010].copy()

    # Convert the 2-year and 4-year columns to float
    for institution in ['2-year college', '4-year college or university']:
        for group in ['Total', 'Males', 'Females']:
            col = institution + ' - ' + group
            df4[col] = df4[col].astype('float')
    
    return df4
In [118]:
def save_data(df4):
    
    writer = pd.ExcelWriter('High School Data.xlsx', engine = 'xlsxwriter')
    df4.to_excel(writer, sheet_name = 'Sheet1', index=False)
    
    workbook = writer.book
    worksheet = writer.sheets['Sheet1']
    
    format1 = workbook.add_format({'num_format': '0'})
    format2 = workbook.add_format({'num_format': '0.0'})

    worksheet.conditional_format('E2:M12', {'type': '3_color_scale'})
    
    worksheet.set_column(0, 0, 14)
    worksheet.set_column(1, 1, 25, format1)
    worksheet.set_column(2, 2, 25, format1)
    worksheet.set_column(3, 3, 25, format1)
    worksheet.set_column(4, 12, 18, format2)
    
    # Save and close the workbook (writer.save() was removed in pandas 2.0)
    writer.close()
    
In [119]:
df = get_the_data()
df2 = clean_data(df)
save_data(df2)
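`clean_data` relies on the table arriving with multi-level column headers, where each column name is a tuple and `x[3]`, `x[2]` pick out individual header levels. A smaller sketch of the same flattening idea, using a made-up two-level header and made-up numbers rather than the deeper header from the NCES page:

```python
import pandas as pd

# Made-up two-level header: (group, institution type).
columns = pd.MultiIndex.from_tuples([
    ('Year', 'Year'),
    ('Total', '2-year college'),
    ('Total', '4-year college or university'),
])
demo = pd.DataFrame([[2010, 10.0, 20.0], [2011, 11.0, 21.0]], columns=columns)

# Each column label is a tuple; join the levels into a single string.
demo.columns = demo.columns.map(lambda x: x[1] + ' - ' + x[0])
print(list(demo.columns))
```

After flattening, the columns can be selected with ordinary strings such as `'Year - Year'`, which is what the filter on years in `clean_data` depends on.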

What's Next?¶

  • Using Task Scheduler to Run Python Scripts on a Schedule 🔥
  • Using Python and SQL together 💪
  • Grabbing Data via APIs 📮
  • Emailing Reports 🚀

Questions? 👏 😀 🔥 💗¶
